SIGNAL PEPTIDE

HUMAN -19 MQIELSTCFF LCLLRFCFS 

PIG MQLELSTCVF LCLLPLGFS 

MOUSE MQIALFACFF LSLFNFCSS 

******* * 



FIG. 1A 
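The asterisk rows in these alignments mark columns where the human, pig and mouse residues are identical. A minimal sketch of that marking, assuming the three rows are already aligned column by column and using the signal peptide rows above as input (the helper name `conserved_columns` is hypothetical, not part of the original figure):

```python
def conserved_columns(seqs):
    """Return a marker string with '*' wherever all sequences share a residue.

    Gap characters ('-') are never marked as conserved.
    """
    length = min(len(s) for s in seqs)
    return "".join(
        "*" if len({s[i] for s in seqs}) == 1 and seqs[0][i] != "-" else " "
        for i in range(length)
    )


# Signal peptide rows as printed in the alignment (spaces removed).
human = "MQIELSTCFFLCLLRFCFS"
pig = "MQLELSTCVFLCLLPLGFS"
mouse = "MQIALFACFFLSLFNFCSS"

marks = conserved_columns([human, pig, mouse])
print(human)
print(pig)
print(mouse)
print(marks)
print(marks.count("*"), "conserved positions")
```

For the signal peptide this yields eight conserved positions, the same number of asterisks as in the marker row above; the column positions of the printed asterisks were lost when the figure was flattened to text.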



A1 DOMAIN

HUMAN 1 ATRRYYLGAV ELSWDYMQSD LG-ELPVDAR FPPRVPKSFP FNTSVVYKKT

PIG AIRRYYLGAV ELSWDYRQSE LLRELHVDTR FPATAPGALP LGPSVLYKKT 

MOUSE AIRRYYLGAV ELSWNYIQSD LLSVLHTDSR FLPRMSTSFP FNTSIMYKKT 

********** **** ******** * * **** 

50 LFVEFTDHLF NIAKPRPPWM GLLGPTIQAE VYDTVVITLK NMASHPVSLH

VFVEFTDQLF SVARPRPPWM GLLGPTIQAE VYDTVVVTLK NMASHPVSLH

VFVEYKDQLF NIAKPRPPWM GLLGPTIWTE VHDTVVITLK NMASHPVSLH
*** * ** * ****** ******* * * **** *** ********** 

100 AVGVSYWKAS EGAEYDDQTS QREKEDDKVF PGGSHTYVWQ VLKENGPMAS 

AVGVSFWKSS EGAEYEDHTS QREKEDDKVL PGKSQTYVWQ VLKENGPTAS 

AVGVSYWKAS EGDEYEDQTS QMEKEDDKVF PGESHTYVWQ VLKENGPMAS 
***** ** * ** **** ** * ******* ** * ***** ******* ** 

150 DPLCLTYSYL SHVDLVKDLN SGLIGALLVC REGSLAKEKT QTLHKFILLF 

DPPCLTYSYL SHVDLVKDLN SGLIGALLVC REGSLTRERT QNLHEFVLLF 

DPPCLTYSYM SHVDLVKDLN SGLIGALLVC KEGSLSKERT QMLYQFVLLF 
********* ********** ********** **** * * * * * *** 

200 AVFDEGKSWH SETKNSLMQD RDAASARAWP KMHTVNGYVN RSLPGLIGCH 

AVFDEGKSWH SARNDSWTRA MDPAPARAQP AMHTVNGYVN RSLPGLIGCH 

AVFDEGKSWH SETNDSYTQS MDSASARDWP KMHTVNGYVN RSLPGLIGCH 
********** * * ***** ********* ********** 

250 RKSVYWHVIG MGTTPEVHSI FLEGHTFLVR NHRQASLEIS PITFLTAQTL 

KKSVYWHVIG MGTSPEVHSI FLEGHTFLVR HHRQASLEIS PLTFLTAQTF 

RKSVYWHVIG MGTTPEIHSI FLEGHTFFVR NHRQASLEIS PITFLTAQTL 
********* *** ** *** ******* ** ********* ********* 

APC/IXa ♦

300 LMDLGQFLLF CHISSHQHDG MEAYVKVDSC PEEPQLRMKN NEEAEDYDDD 

LMDLGQFLLF CHISSHHHGG MEAHVRVESC AEEPQLRRKA DE-EEDYDDN 

LIDLGQFLLF CHISSHKHDG MEAYVKVDSC PEESQWQKKN NN-EEMEDYD 
* ******** ****** * * *** ******* * * * 

IIa/Xa

350 LTDSEMDVVR FDDDNSPSFI QIR

LYDSDMDVVR LDGDDVSPFI QIR

DDLYSEMDMF TLDYDSSPFI QIR 

* * * * * 
A2 DOMAIN

HUMAN 373 SVAKKHPKTW VHYIAAEEED WDYAPLVLAP DDRSYKSQYL NNGPQRIGRK 

PIG SVAKKHPKTW VHYISAEEED WDYAPAVPSP SDRSYKSLYL NSGPQRIGRK 

MOUSE SVAKKYPKTW IHYISAEEED WDYAPSVPTS DNGSYKSQYL SNGPHRIGRK 

***** **** *** ***** ***** * **** ** ** ***** 

423 YKKVRFMAYT DETFKTREAI QHESGILGPL LYGEVGDTLL IIFKNQASRP 

YKKARFVAYT DVTFKTRKAI PYESGILGPL LYGEVGDTLL IIFKNKASRP 

YKKVRFIAYT DETFKTRETI QHESGLLGPL LYGEVGDTLL IIFKNQASRP 
*** ** *** * ***** * *** **** ********** ***** **** 

A2 INHIBITOR EPITOPE 

473 YNIYPHGITD VRPLYSRRLP KGVKHLKDFP ILPGEIFKYK WTVTVEDGPT 

YNIYPHGITD VSALHPGRLL KGWKHLKDMP ILPGETFKYK WTVTVEDGPT 

YNIYPHGITD VSPLHARRLP RGIKHVKDLP IHPGEIFKYK WTVTVEDGPT 
********** * * ** ******** *** **** ********** 

F.IXa BINDING 
APC 

523 KSDPRCLTRY YSSFVNMERD LASGLIGPLL ICYKESVDQR GNQIMSDKRN

KSDPRCLTRY YSSSINLEKD LASGLIGPLL ICYKESVDQR GNQMMSDKRN 

KSDPRCLTRY YSSFINPERD LASGLIGPLL ICYKESVDQR GNQMMSDKRN 
********** *** * * * ********** ********** *** ****** 

573 VILFSVFDEN RSWYLTENIQ RFLPNPAGVQ LEDPEFQASN IMHSINGYVF 
VILFSVFDEN QSWYLAENIQ RFLPNPDGLQ PQDPEFQASN IMHSINGYVF 
VILFSIFDEN QSWYITENMQ RFLPNAAKTQ PQDPGFQASN IMHSINGYVF 
***** **** *** ** * ***** * ** ***** ********** 

623 DSLQLSVCLH EVAYWYILSI GAQTDFLSVF FSGYTFKHKM VYEDTLTLFP 

DSLQLSVCLH EVAYWYILSV GAQTDFLSVF FSGYTFKHKM VYEDTLTLFP 

DSLELTVCLH EVAYWHILSV GAQTDFLSIF FSGYTFKHKM VYEDTLTLFP 
*** * **** ***** *** ******** * ********** ********** 

673 FSGETVFMSM ENPGLWILGC HNSDFRNRGM TALLKVSSCD KNTGDYYEDS 

FSGETVFMSM ENPGLWVLGC HNSDLRNRGM TALLKVYSCD RDIGDYYDNT 

FSGETVFMSM ENPGLWVLGC HNSDFRKRGM TALLKVSSCD KSTSDYYEEI 
********** ********** **** * *** ********** * *** 

♦ IIa/Xa/APC

723 YEDISAYLLS KNNAIEPR 

YEDIPGFLLS GKNVIEPR 

YEDIPTQLVN ENNVIDPR 
**** * * * ** 
B DOMAIN

HUMAN 741 SFSQNSRHPS TRQKQFNATT IPENDIEKTD PWFAHRTPMP KIQNVSSSDL 

PIG SFAQNSRPPS ASQKQFQTIT SPEDDVE-LD PQSGERTQAL EELSVPSGDG 

MOUSE SFFQNTNHPN TRKKKFKDST IPKNDMEKIE PQFEEIAEML KVQSVSVSDM 

***** ******** ** 

791 LMLLRQS-PT PHGLSLSDLQ EAKYETFSDD PSPGAIDSNN SLSEMTHFRP 

SMLLGQN-PA PHGSSSSDLQ EARNEA--DD YLPGARERNT APSAAARLRP 

LMLLGQSHPT PHGLFLSDGQ EAIYEAIHDD HSPNAIDSNE GPSKVTQLRP 
*** * *** ***** ** * * * * ** 

840 QLHHSGDMVF TPESGLQLRL NEKLGTTAAT ELKKLDFKVS ST-SNNLIS- 

ELHHSAERVL TPEP EK ELKKLDSKMS SSSDLLKTSP 

ESHHSEKIVF TPQPGLQLRS NKSLETTIEV KWKKLGLQVS SLPSNLMTT- 
*** * ** *** * * 

888 TIPSDNLAAGT DNTSSLGPPS MPVHYDSQLD TTLFGKKSSP LTESGGPLSL 

TIPSDTLSAET ERTHSLGPPH PQVNFRSQLG AIVLGKNSSH FIGAGVPLGS 

TILSDNLKATF EKTDSSGFPD MPVHSSSKLS TTAFGKKAYS LVGSHVPLNA 
****** * * * * * * * ** ** 

939 SEENNDSKLL ESGLMNSQES SWGKNVSSTE SGRLFKGKRA HGPALLTKDN 

TEED HES SLGENVSPVE SDGIFEKERA HGPASLTKDD 

SEENSDSNIL DSTLMYSQES LPRDNILSIE NDRLLREKRF HGIALLTKDN 
** ** * * ******** 

989 ALFKVSISLL KTNKTSNNSA TNRKTHIDGP SLLIENSPSV WQNILESDTE 

VLFKVNISLV KTNKARVYLK TNRKIHIDDA ALLTENRAS- 

TLFKDNVSLM KTNKTYNHST TNEKLHTESP TSIENSTTDL QDAILKVNSE 
*** ** **** ** * * 

1039 FKKVTPLIHD RMLMDKNATA LRLNHMSNKT TSSKNMEMVQ QKKEGPIPPD 

ATFMDKNTTA SGLNHVSN- - 

IQEVTALIHD GTLLGKNSTY LRLNHMLNRT TSTKNKDIFH RKDEDPIPQD 
* *** ** *** * 

1089 AQNPDMSFFK MLFLPESARW IQRTHGKNSL NSGQGPSPKQ LVSLGPEKSV 

W IKGPLGKNPL SSERGPSPEL LTSSGSGKSV 

EENTIMPFSK MLFLSESSNW FKKTNGNNSL NSEQEHSPKQ LVYLMFKKYV 

* **** *** ** 

1139 EGQNFLSEKN KVVVGKGEFT KDVGLKEMVF PSSRNLFLTN LDNLHENNTH

KGQSSGQGRI RVAVEEEELS KG KEMML PNSELTFLTN SADVQGNDTH 

KNQSFLSEKN KVTVEQDGFT KNIGLKDMAF PHNMSIFLTT LSNVHENGRH 
* * * * * * *** * * 

1189 NQEKKIQEEI EKKETLIQEN VVLPQIHTVT GTKNFMKNLF LLSTRQNVEG

SQGKKSREEM ERREKLVQEK VDLPQVYTAT GTKNFLRNIF HQSTEPSVEG 

NQEKNIQEEI EK-EALIEEK VVLPQVHEAT GSKNFLKDIL ILGTRQNI--
* * ***** *** * * *** 

1239 SYDGAYAPVL QDFRSLNDST NRTKKHTAHF SK--KGEEEN LEGLGNQTKQ 

FDGGSHAPVP QDSRSLNDSA ERAETHIAHF SAIR--EEAP LEAPGNRT- - 

SLYEVHVPVL QNITSINNST NTVQIHMEHF FKRRKDKETN SEGLVNKTRE 
****** *** * ** 
1287 IVEKYACTTR ISPNTSQQNF VTQRSKRALK QFRLPLEETE LEKRIIVDDT

GPGPRSA VPRRVKQSLK QIRLPLEEIK PERGWLNAT 

MVKNYP SQKNI TTQRSKRALG QFRL 

1337 STQWSKNMKH LTPSTLTQID YNEKEKGAIT QSPLSDCLTR SHSIPQANRS 

STRWS 

STQWLKTINC STQCIIKQID HSKEMKKFIT KSSLSDS-SV IKSTTQTNSS 
** * 

1387 PLPIAKVSSF PSIRPIYLTR VLFQDNSSHL PAASY R KKDSGVQESS 

ESS 

DSHIVKTSAF P PIDLKR SPFQNKFSHV QASSYIYDFK TKSSRIQESN 

** 

1433 HFLQGAKKNN LSLAILTLEM TGDQREVGSL GTSATNSVTY KKVENTVLPK 

PILQGAKRNN LSLPFLTLEM AGGQGKISAL GKSAAGPLAS GKLEKAVLSS 

NFLKETKINN PSLAILPWNM FIDQGKFTSP GKSNTNSVTY KKRENIIFLK 
******** * ** ** 

1483 PDLPKTSGKV ELLPKVHIYQ KDLFPTETSN GSPGHLDLVE GSLLQGTEGA 

AGLSEASGKA EFLPKVRVHR EDLLPQKTSN VSCAHGDLGQ EIFLQKTRGP 

PTLPEESGKI ELLPQVSIQE EEILPTETSH GSPGHLNLMK EVFLQKIQGP 
*** * ** * * * * * *** * 

1533 IKWNEANRPG KVPFLRVATE SSAKTPSKLL DPLAWDNHYG TQIPKEEWKS 

VNLNKVNRPG RTPSKLL G PPMPKE-WES 

TKWNKAKRHG ESIKGKTES- -SKNTRSKLL NHHAWDYHYA AQIPKDMWKS 
* * * * **** * * * 

1583 QEKSPEKTAF KKKDTI-LSLN ACESNHAIAA INEGQNKPEI EVTWAKQGRT 

LEKSPKSTAL RTKDIISLPLD RHESNHSIAA KNEGQAETQR EAAWTKQGGP 

KEKSPEIISI KQEDTI-LSLR PHGNSHSIGA -NEKQNWPQR ETTWVKQGQT 
**** * * ****** * *** 

1633 ERLCSQNPPV LKRHQR

GRLCAPKPPV LRRHQR 

QRTCSQIPPV LKRHQR 
** ******** 
LIGHT CHAIN ACTIVATION PEPTIDE

♦ ♦ IIa/Xa

HUMAN 1649 EITRTTLQSDQEEIDYDDTISVEMKKEDFDIYDEDENQSPR

PIG DISLPTFQPEEDKMDYDDIFSTETKGEDFDIYGEDENQDPR

MOUSE EL--SAFQSEQEATDYDDAITIET-IEDFDIYSEDIKQGPR 
* * **** * ****** ** * ** 

FIG. 1E 



Title: Nucleic Acid and Amino Acid Sequences Encoding High-Level 

Expressor Factor VIII Polypeptides and Methods of Use 

Inventor(s): Lollar 

Application No: To be Assigned 

AttyDktNo: 007157/276516

A3 DOMAIN 

IXa Xa 

HUMAN 1690 SFQKKTRHYF IAAVERLWDY GMSSSPHVLR NRAQSGSVPQ FKKVVFQEFT

PIG SFQKRTRHYF IAAVEQLWDY GMSESPRALR NRAQNGEVPR FKKVVFREFA

MOUSE SVQQKTRHYF IAAVERLWDY GMSTS-HVLR NRYQSDNVPQ FKKVVFQEFT

* * ***** ***** **** *** * ***** ** ****** ** 

1740 DGSFTQPLYR GELNEHLGLL GPYIRAEVED NIMVTFRNQA SRPYSFYSSL 

DGSFTQPSYR GELNKHLGLL GPYIRAEVED NIMVTFKNQA SRPYSFYSSL 

DGSFSQPLYR GELNEHLGLL GPYIRAEVED NIMVTFKNQA SRPYSFYSSL 
**** ** ** **** ***** ********** ****** *** ********** 

FACTOR IXa BINDING 

1790 ISYEEDQRQG AEPRKNFVKP NETKTYFWKV QHHMAPTKDE FDCKAWAYFS

ISYPDDQEQG AEPRHNFVQP NETRTYFWKV QHHMAPTEDE FDCKAWAYFS 

ISYKEDQR-G EEPRRNFVKP NETKIYFWKV QHHMAPTEDE FDCKAWAYFS 
*** ** * *** *** * *** ***** ********** ********** 

1840 DVDLEKDVHS GLIGPLLVCH TNTLNPAHGR QVTVQEFALF FTIFDETKSW 

DVDLEKDVHS GLIGPLLICR ANTLNAAHGR QVTVQEFALF FTIFDETKSW 

DVDLERDMHS GLIGPLLICH ANTLNPAHGR QVSVQEFALL FTIFDETKSW 
***** * ** ******* * **** **** ** ****** ********** 

1890 YFTENMERNC RAPCNIQMED PTFKENYRFH AINGYIMDTL PGLVMAQDQR 

YFTENVERNC RAPCHLQMED PTLKENYRFH AINGYVMDTL PGLVMAQNQR 

YFTENVKRNC KTPCNFQMED PTLKENYRFH AINGYVMDTL PGLVMAQDQR 
***** *** ** **** ** ******* ********** ******* ** 

1940 IRWYLLSMGS NENIHSIHFS GHVFTVRKKE EYKMALYNLY PGVFETVEML 

IRWYLLSMGS NENIHSIHFS GHVFSVRKKE EYKMAVYNLY PGVFETVEML 

IRWYLLSMGN NENIQSIHFS GHVFTVRKKE EYKMAVYNLY PGVFETLEMI 
********* **** ***** **** ***** ***** **** ****** ** 

PROTEIN C BINDING 

1990 PSKAGIWRVE CLIGEHLHAG MSTLFLVYSN

PSKVGIWRIE CLIGEHLQAG MSTTFLVYSK 

PSRAGIWRVE CLIGEHLQAG MSTLFLVYSK 
** **** * ******* ** *** ****** 
C1 DOMAIN

HUMAN 2020 KCQTPLGMAS GHIRDFQITA SGQYGQWAPK LARLHYSGSI NAWSTKEPFS 

PIG ECQAPLGMAS GRIRDFQITA SGQYGQWAPK LARLHYSGSI NAWSTKDPHS 

MOUSE QCQIPLGMAS GSIRDFQITA SGHYGQWAPN LARLHYSGSI NAWSTKEPFS 

** ****** * ******** ** ****** ********** ****** * * 

2070 WIKVDLLAPM IIHGIKTQGA RQKFSSLYIS QFIIMYSLDG KKWQTYRGNS 
WIKVDLLAPM IIHGIMTQGA RQKFSSLYIS QFIIMYSLDG RNWQSYRGNS 
WIKVDLLAPM IVHGIKTQGA RQKFSSLYIS QFIIMYSLDG KKWLSYQGNS 
********** * *** **** ********** ********** * * *** 

2120 TGTLMVFFGN VDSSGIKHNI FNPPIIARYI RLHPTHYSIR STLRMELMGCDLN 
TGTLMVFFGN VDASGIKHNI FNPPIVARYI RLHPTHYSIR STLRMELMGCDLN 
TGTLMVFFGN VDSSGIKHNS FNPPIIARYI RLHPTHSSIR STLRMELMGCDLN 
********** ** ****** ***** **** ****** *** ************* 

FIG. 1G 



C2 DOMAIN INHIBITOR EPITOPE 

HUMAN 2173 SCSMPLGMES KAISDAQITA SSYFTNMFAT WSPSKARLHL QGRSNAWRPQ
PIG SCSMPLGMQN KAISDSQITA SSHLSNIFAT WSPSQARLHL QGRTNAWRPR 

MOUSE SCSIPLGMES KVISDTQITA SSYFTNMFAT WSPSQARLHL QGRTNAWRPQ 

*** **** * *** **** ** * *** **** ***** *** ***** 

C2 

2223 VNNPKEWLQV DFQKTMKVTG VTTQGVKSLL TSMYVKEFLI SSSQDGHQWT

VSSAEEWLQV DLQKTVKVTG ITTQGVKSLL SSMYVKEFLV SSSQDGRRWT 

VNDPKQWLQV DLQKTMKVTG IITQGVKSLF TMSFVKEFLI SSSQDGHHWT 
* **** * *** **** ******** ** ***** ****** ** 

PHOSPHOLIPID 

2273 LFFQNGKVKV FQGNQDSFTP VVNSLDPPLL TRYLRIHPQS WVHQIALRME

LFLQDGHTKV FQGNQDSSTP VVNALDPPLF TRYLRIHPTS WAQHIALRLE

QILYNGKVKV FQGNQDSSTP MMNSLDPPLL TRYLRIHPQI WEHQIALRLE 
* ************ ***** ******** * ***** 

BINDING 
2323 VLGCEAQDLY
VLGCEAQDLY 
ILGCEAQQQY 
HUMAN
PORCINE



HSQ(HHHHH)

P/OL(PPPPP)

HP44/OL(PPPHH)
HP47/OL(PPPHH) 
HP46/SQ(PHHHH) 
HP1/SQ(HPHHH) 
HP30/OL(HHPHH) 
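Each construct label above carries a five-letter code in parentheses. A minimal sketch of reading that code, assuming the letters give the species of origin (H for human, P for porcine) of the five regions in the order A1, A2, ap-A3, C1, C2; that order is an assumption inferred from the domain tables given with the sequence listings, and the function name is hypothetical:

```python
# Assumed domain order, inferred from the domain tables in the listings.
DOMAINS = ("A1", "A2", "ap-A3", "C1", "C2")
ORIGIN = {"H": "human", "P": "porcine"}


def parse_construct(label):
    """Split a label such as 'HP1/SQ(HPHHH)' into its name and domain origins."""
    name, _, code = label.partition("(")
    code = code.rstrip(")")
    if len(code) != len(DOMAINS):
        raise ValueError(f"expected {len(DOMAINS)} letters, got {code!r}")
    return name, {domain: ORIGIN[letter] for domain, letter in zip(DOMAINS, code)}


name, origins = parse_construct("HP1/SQ(HPHHH)")
print(name)
for domain, species in origins.items():
    print(f"  {domain}: {species}")
```

Under this reading, HP1/SQ is human throughout except for a porcine A2 domain.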
AMINO ACID SEQUENCE OF HP44/OL

1 MQLELSTCVF LCLLPLGFSA IRRYYLGAVE LSWDYRQSEL LRELHVDTRF 

51 PATAPGALPL GPSVLYKKTV FVEFTDQLFS VARPRPPWMG LLGPTIQAEV 

101 YDTVVVTLKN MASHPVSLHA VGVSFWKSSE GAEYEDHTSQ REKEDDKVLP

151 GKSQTYVWQV LKENGPTASD PPCLTYSYLS HVDLVKDLNS GLIGALLVCR 

201 EGSLTRERTQ NLHEFVLLFA VFDEGKSWHS ARNDSWTRAM DPAPARAQPA 

251 MHTVNGYVNR SLPGLIGCHK KSVYWHVIGM GTSPEVHSIF LEGHTFLVRH 

301 HRQASLEISP LTFLTAQTFL MDLGQFLLFC HISSHHHGGM EAHVRVESCA 

351 EEPQLRRKAD EEEDYDDNLY DSDMDVVRLD GDDVSPFIQI RSVAKKHPKT

401 WVHYISAEEE DWDYAPAVPS PSDRSYKSLY LNSGPQRIGR KYKKARFVAY 

451 TDVTFKTRKA IPYESGILGP LLYGEVGDTL LIIFKNKASR PYNIYPHGIT 

501 DVSALHPGRL LKGWKHLKDM PILPGETFKY KWTVTVEDGP TKSDPRCLTR 

551 YYSSSINLEK DLASGLIGPL LICYKESVDQ RGNQMMSDKR NVILFSVFDE 

601 NQSWYLAENI QRFLPNPDGL QPQDPEFQAS NIMHSINGYV FDSLQLSVCL 

651 HEVAYWYILS VGAQTDFLSV FFSGYTFKHK MVYEDTLTLF PFSGETVFMS 

701 MENPGLWVLG CHNSDLRNRG MTALLKVYSC DRDIGDYYDN TYEDIPGFLL 

751 SGKNVIEPRS FAQNSRPPSA SAPKPPVLRR HQRDISLPTF QPEEDKMDYD 

801 DIFSTETKGE DFDIYGEDEN QDPRSFQKRT RHYFIAAVEQ LWDYGMSESP 

851 RALRNRAQNG EVPRFKKVVF REFADGSFTQ PSYRGELNKH LGLLGPYIRA

901 EVEDNIMVTF KNQASRPYSF YSSLISYPDD QEQGAEPRHN FVQPNETRTY 

951 FWKVQHHMAP TEDEFDCKAW AYFSDVDLEK DVHSGLIGPL LICRANTLNA

1001 AHGRQVTVQE FALFFTIFDE TKSWYFTENV ERNCRAPCHL QMEDPTLKEN 
1051 YRFHAINGYV MDTLPGLVMA QNQRIRWYLL SMGSNENIHS IHFSGHVFSV 

1101 RKKEEYKMAV YNLYPGVFET VEMLPSKVGI WRIECLIGEH LQAGMSTTFL 

1151 VYSKKCQTPL GMASGHIRDF QITASGQYGQ WAPKLARLHY SGSINAWSTK 

1201 EPFSWIKVDL LAPMIIHGIK TQGARQKFSS LYISQFIIMY SLDGKKWQTY
1251 RGNSTGTLMV FFGNVDSSGI KHNIFNPPII ARYIRLHPTH YSIRSTLRME 

1301 LMGCDLNSCS MPLGMESKAI SDAQITASSY FTNMFATWSP SKARLHLQGR
1351 SNAWRPQVNN PKEWLQVDFQ KTMKVTGVTT QGVKSLLTSM YVKEFLISSS 
1401 QDGHQWTLFF QNGKVKVFQG NQDSFTPVVN SLDPPLLTRY LRIHPQSWVH
1451 QIALRMEVLG CEAQDLY* 

1-19 SIGNAL PEPTIDE 
20-391 A1 DOMAIN
392-759 A2 DOMAIN
760-783 OL LINKER
784-1154 ap-A3
1155-1307 C1 DOMAIN
1308-1467 C2 DOMAIN 
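The residue ranges above map onto nucleotide coordinates by simple codon arithmetic. A short sketch, assuming 1-based inclusive coordinates and a coding sequence that starts at nucleotide 1 (the function name `aa_to_nt` is hypothetical):

```python
def aa_to_nt(first_residue, last_residue):
    """Convert an inclusive residue range to the inclusive nucleotide range encoding it."""
    return 3 * (first_residue - 1) + 1, 3 * last_residue


# Residue ranges as listed for HP44/OL.
hp44_ol_domains = {
    "SIGNAL PEPTIDE": (1, 19),
    "A1 DOMAIN": (20, 391),
    "A2 DOMAIN": (392, 759),
    "OL LINKER": (760, 783),
    "ap-A3": (784, 1154),
    "C1 DOMAIN": (1155, 1307),
    "C2 DOMAIN": (1308, 1467),
}

for domain, (first, last) in hp44_ol_domains.items():
    nt_first, nt_last = aa_to_nt(first, last)
    print(f"{nt_first}-{nt_last} {domain}")
```

The ranges computed this way agree with the nucleotide coordinates listed with the HP44/OL nucleotide sequence (1-57, 58-1173, 1174-2277, 2278-2349, 2350-3462, 3463-3921, 3922-4401).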
HP44/OL NUCLEOTIDE SEQUENCE

1 ATGCAGCTAG AGCTCTCCAC CTGTGTCTTT CTGTGTCTCT TGCCACTCGG 

TACGTCGATC TCGAGAGGTG GACACAGAAA GACACAGAGA ACGGTGAGCC 
51 CTTTAGTGCC ATCAGGAGAT ACTACCTGGG CGCAGTGGAA CTGTCCTGGG 

GAAATCACGG TAGTCCTCTA TGATGGACCC GCGTCACCTT GACAGGACCC 
101 ACTACCGGCA AAGTGAACTC CTCCGTGAGC TGCACGTGGA CACCAGATTT 

TGATGGCCGT TTCACTTGAG GAGGCACTCG ACGTGCACCT GTGGTCTAAA 
151 CCTGCTACAG CGCCAGGAGC TCTTCCGTTG GGCCCGTCAG TCCTGTACAA 

GGACGATGTC GCGGTCCTCG AGAAGGCAAC CCGGGCAGTC AGGACATGTT 
201 AAAGACTGTG TTCGTAGAGT TCACGGATCA ACTTTTCAGC GTTGCCAGGC 

TTTCTGACAC AAGCATCTCA AGTGCCTAGT TGAAAAGTCG CAACGGTCCG 
251 CCAGGCCACC ATGGATGGGT CTGCTGGGTC CTACCATCCA GGCTGAGGTT 

GGTCCGGTGG TACCTACCCA GACGACCCAG GATGGTAGGT CCGACTCCAA 
301 TACGACACGG TGGTCGTTAC CCTGAAGAAC ATGGCTTCTC ATCCCGTTAG 

ATGCTGTGCC ACCAGCAATG GGACTTCTTG TACCGAAGAG TAGGGCAATC 
351 TCTTCACGCT GTCGGCGTCT CCTTCTGGAA ATCTTCCGAA GGCGCTGAAT 

AGAAGTGCGA CAGCCGCAGA GGAAGACCTT TAGAAGGCTT CCGCGACTTA 
401 ATGAGGATCA CACCAGCCAA AGGGAGAAGG AAGACGATAA AGTCCTTCCC 

TACTCCTAGT GTGGTCGGTT TCCCTCTTCC TTCTGCTATT TCAGGAAGGG 
451 GGTAAAAGCC AAACCTACGT CTGGCAGGTC CTGAAAGAAA ATGGTCCAAC 

CCATTTTCGG TTTGGATGCA GACCGTCCAG GACTTTCTTT TACCAGGTTG
501 AGCCTCTGAC CCACCATGTC TTACCTACTC ATACCTGTCT CACGTGGACC 

TCGGAGACTG GGTGGTACAG AATGGATGAG TATGGACAGA GTGCACCTGG 
551 TGGTGAAAGA CCTGAATTCG GGCCTCATTG GAGCCCTGCT GGTTTGTAGA 

ACCACTTTCT GGACTTAAGC CCGGAGTAAC CTCGGGACGA CCAAACATCT 
601 GAAGGGAGTC TGACCAGAGA AAGGACCCAG AACCTGCACG AATTTGTACT 

CTTCCCTCAG ACTGGTCTCT TTCCTGGGTC TTGGACGTGC TTAAACATGA 
651 ACTTTTTGCT GTCTTTGATG AAGGGAAAAG TTGGCACTCA GCAAGAAATG 

TGAAAAACGA CAGAAACTAC TTCCCTTTTC AACCGTGAGT CGTTCTTTAC 
701 ACTCCTGGAC ACGGGCCATG GATCCCGCAC CTGCCAGGGC CCAGCCTGCA 

TGAGGACCTG TGCCCGGTAC CTAGGGCGTG GACGGTCCCG GGTCGGACGT 
751 ATGCACACAG TCAATGGCTA TGTCAACAGG TCTCTGCCAG GTCTGATCGG 

TACGTGTGTC AGTTACCGAT ACAGTTGTCC AGAGACGGTC CAGACTAGCC 
801 ATGTCATAAG AAATCAGTCT ACTGGCACGT GATTGGAATG GGCACCAGCC 

TACAGTATTC TTTAGTCAGA TGACCGTGCA CTAACCTTAC CCGTGGTCGG 
851 CGGAAGTGCA CTCCATTTTT CTTGAAGGCC ACACGTTTCT CGTGAGGCAC 

GCCTTCACGT GAGGTAAAAA GAACTTCCGG TGTGCAAAGA GCACTCCGTG 
901 CATCGCCAGG CTTCCTTGGA GATCTCGCCA CTAACTTTCC TCACTGCTCA 

GTAGCGGTCC GAAGGAACCT CTAGAGCGGT GATTGAAAGG AGTGACGAGT 
951 GACATTCCTG ATGGACCTTG GCCAGTTCCT ACTGTTTTGT CATATCTCTT 

CTGTAAGGAC TACCTGGAAC CGGTCAAGGA TGACAAAACA GTATAGAGAA 
1001 CCCACCACCA TGGTGGCATG GAGGCTCACG TCAGAGTAGA AAGCTGCGCC 

GGGTGGTGGT ACCACCGTAC CTCCGAGTGC AGTCTCATCT TTCGACGCGG 
1051 GAGGAGCCCC AGCTGCGGAG GAAAGCTGAT GAAGAGGAAG ATTATGATGA 

CTCCTCGGGG TCGACGCCTC CTTTCGACTA CTTCTCCTTC TAATACTACT 
1101 CAATTTGTAC GACTCGGACA TGGACGTGGT CCGGCTCGAT GGTGACGACG 

GTTAAACATG CTGAGCCTGT ACCTGCACCA GGCCGAGCTA CCACTGCTGC 
1151 TGTCTCCCTT TATCCAAATC CGCTCGGTTG CCAAGAAGCA TCCCAAAACC 

ACAGAGGGAA ATAGGTTTAG GCGAGCCAAC GGTTCTTCGT AGGGTTTTGG 
1201 TGGGTGCACT ACATCTCTGC AGAGGAGGAG GACTGGGACT ACGCCCCCGC 

ACCCACGTGA TGTAGAGACG TCTCCTCCTC CTGACCCTGA TGCGGGGGCG 
1251 GGTCCCCAGC CCCAGTGACA GAAGTTATAA AAGTCTCTAC TTGAACAGTG 

CCAGGGGTCG GGGTCACTGT CTTCAATATT TTCAGAGATG AACTTGTCAC
1301 GTCCTCAGCG AATTGGTAGG AAATACAAAA AAGCTCGATT CGTCGCTTAC 

CAGGAGTCGC TTAACCATCC TTTATGTTTT TTCGAGCTAA GCAGCGAATG 
1351 ACGGATGTAA CATTTAAGAC TCGTAAAGCT ATTCCGTATG AATCAGGAAT

TGCCTACATT GTAAATTCTG AGCATTTCGA TAAGGCATAC TTAGTCCTTA 
1401 CCTGGGACCT TTACTTTATG GAGAAGTTGG AGACACACTT TTGATTATAT 

GGACCCTGGA AATGAAATAC CTCTTCAACC TCTGTGTGAA AACTAATATA 
1451 TTAAGAATAA AGCGAGCCGA CCATATAACA TCTACCCTCA TGGAATCACT 

AATTCTTATT TCGCTCGGCT GGTATATTGT AGATGGGAGT ACCTTAGTGA 
1501 GATGTCAGCG CTTTGCACCC AGGGAGACTT CTAAAAGGTT GGAAACATTT 

CTACAGTCGC GAAACGTGGG TCCCTCTGAA GATTTTCCAA CCTTTGTAAA 
1551 GAAAGACATG CCAATTCTGC CAGGAGAGAC TTTCAAGTAT AAATGGACAG 

CTTTCTGTAC GGTTAAGACG GTCCTCTCTG AAAGTTCATA TTTACCTGTC 
1601 TGACTGTGGA AGATGGGCCA ACCAAGTCCG ATCCTCGGTG CCTGACCCGC 

ACTGACACCT TCTACCCGGT TGGTTCAGGC TAGGAGCCAC GGACTGGGCG 
1651 TACTACTCGA GCTCCATTAA TCTAGAGAAA GATCTGGCTT CGGGACTCAT 

ATGATGAGCT CGAGGTAATT AGATCTCTTT CTAGACCGAA GCCCTGAGTA 
1701 TGGCCCTCTC CTCATCTGCT ACAAAGAATC TGTAGACCAA AGAGGAAACC 

ACCGGGAGAG GAGTAGACGA TGTTTCTTAG ACATCTGGTT TCTCCTTTGG 
1751 AGATGATGTC AGACAAGAGA AACGTCATCC TGTTTTCTGT ATTCGATGAG 

TCTACTACAG TCTGTTCTCT TTGCAGTAGG ACAAAAGACA TAAGCTACTC 
1801 AATCAAAGCT GGTACCTCGC AGAGAATATT CAGCGCTTCC TCCCCAATCC 

TTAGTTTCGA CCATGGAGCG TCTCTTATAA GTCGCGAAGG AGGGGTTAGG 
1851 GGATGGATTA CAGCCCCAGG ATCCAGAGTT CCAAGCTTCT AACATCATGC 

CCTACCTAAT GTCGGGGTCC TAGGTCTCAA GGTTCGAAGA TTGTAGTACG 
1901 ACAGCATCAA TGGCTATGTT TTTGATAGCT TGCAGCTGTC GGTTTGTTTG 

TGTCGTAGTT ACCGATACAA AAACTATCGA ACGTCGACAG CCAAACAAAC 
1951 CACGAGGTGG CATACTGGTA CATTCTAAGT GTTGGAGCAC AGACGGACTT 

GTGCTCCACC GTATGACCAT GTAAGATTCA CAACCTCGTG TCTGCCTGAA 
2001 CCTCTCCGTC TTCTTCTCTG GCTACACCTT CAAACACAAA ATGGTCTATG 

GGAGAGGCAG AAGAAGAGAC CGATGTGGAA GTTTGTGTTT TACCAGATAC 
2051 AAGACACACT CACCCTGTTC CCCTTCTCAG GAGAAACGGT CTTCATGTCA 

TTCTGTGTGA GTGGGACAAG GGGAAGAGTC CTCTTTGCCA GAAGTACAGT 
2101 ATGGAAAACC CAGGTCTCTG GGTCCTTGGG TGCCACAACT CAGACTTGCG 

TACCTTTTGG GTCCAGAGAC CCAGGAACCC ACGGTGTTGA GTCTGAACGC 
2151 GAACAGAGGG ATGACAGCCT TACTGAAGGT GTATAGTTGT GACAGGGACA 

CTTGTCTCCC TACTGTCGGA ATGACTTCCA CATATCAACA CTGTCCCTGT 
2201 TTGGTGATTA TTATGACAAC ACTTATGAAG ATATTCCAGG CTTCTTGCTG 

AACCACTAAT AATACTGTTG TGAATACTTC TATAAGGTCC GAAGAACGAC 
2251 AGTGGAAAGA ATGTCATTGA ACCTAGGAGC TTTGCCCAGA ATTCAAGACC 

TCACCTTTCT TACAGTAACT TGGATCCTCG AAACGGGTCT TAAGTTCTGG 
2301 CCCTAGTGCG AGCGCTCCAA AGCCTCCGGT CCTGCGACGG CATCAGAGGG 

GGGATCACGC TCGCGAGGTT TCGGAGGCCA GGACGCTGCC GTAGTCTCCC 
2351 ACATAAGCCT TCCTACTTTT CAGCCGGAGG AAGACAAAAT GGACTATGAT 

TGTATTCGGA AGGATGAAAA GTCGGCCTCC TTCTGTTTTA CCTGATACTA
2401 GATATCTTCT CAACTGAAAC GAAGGGAGAA GATTTTGACA TTTACGGTGA 

CTATAGAAGA GTTGACTTTG CTTCCCTCTT CTAAAACTGT AAATGCCACT 
2451 GGATGAAAAT CAGGACCCTC GCAGCTTTCA GAAGAGAACC CGACACTATT 

CCTACTTTTA GTCCTGGGAG CGTCGAAAGT CTTCTCTTGG GCTGTGATAA 
2501 TCATTGCTGC GGTGGAGCAG CTCTGGGATT ACGGGATGAG CGAATCCCCC 

AGTAACGACG CCACCTCGTC GAGACCCTAA TGCCCTACTC GCTTAGGGGG 
2551 CGGGCGCTAA GAAACAGGGC TCAGAACGGA GAGGTGCCTC GGTTCAAGAA 

GCCCGCGATT CTTTGTCCCG AGTCTTGCCT CTCCACGGAG CCAAGTTCTT 
2601 GGTGGTCTTC CGGGAATTTG CTGACGGCTC CTTCACGCAG CCGTCGTACC 

CCACCAGAAG GCCCTTAAAC GACTGCCGAG GAAGTGCGTC GGCAGCATGG 
2651 GCGGGGAACT CAACAAACAC TTGGGGCTCT TGGGACCCTA CATCAGAGCG 

CGCCCCTTGA GTTGTTTGTG AACCCCGAGA ACCCTGGGAT GTAGTCTCGC 
2701 GAAGTTGAAG ACAACATCAT GGTAACTTTC AAAAACCAGG CGTCTCGTCC 

CTTCAACTTC TGTTGTAGTA CCATTGAAAG TTTTTGGTCC GCAGAGCAGG 
2751 CTATTCCTTC TACTCGAGCC TTATTTCTTA TCCGGATGAT CAGGAGCAAG 
GATAAGGAAG ATGAGCTCGG AATAAAGAAT AGGCCTACTA GTCCTCGTTC
2801 GGGCAGAACC TCGACACAAC TTCGTCCAGC CAAATGAAAC CAGAACTTAC 

CCCGTCTTGG AGCTGTGTTG AAGCAGGTCG GTTTACTTTG GTCTTGAATG 
2851 TTTTGGAAAG TGCAGCATCA CATGGCACCC ACAGAAGACG AGTTTGACTG 

AAAACCTTTC ACGTCGTAGT GTACCGTGGG TGTCTTCTGC TCAAACTGAC 
2901 CAAAGCCTGG GCCTACTTTT CTGATGTTGA CCTGGAAAAA GATGTGCACT 

GTTTCGGACC CGGATGAAAA GACTACAACT GGACCTTTTT CTACACGTGA 
2951 CAGGCTTGAT CGGCCCCCTT CTGATCTGCC GCGCCAACAC CCTGAACGCT 

GTCCGAACTA GCCGGGGGAA GACTAGACGG CGCGGTTGTG GGACTTGCGA 
3001 GCTCACGGTA GACAAGTGAC CGTGCAAGAA TTTGCTCTGT TTTTCACTAT 

CGAGTGCCAT CTGTTCACTG GCACGTTCTT AAACGAGACA AAAAGTGATA 
3051 TTTTGATGAG ACAAAGAGCT GGTACTTCAC TGAAAATGTG GAAAGGAACT 

AAAACTACTC TGTTTCTCGA CCATGAAGTG ACTTTTACAC CTTTCCTTGA 
3101 GCCGGGCCCC CTGCCATCTG CAGATGGAGG ACCCCACTCT GAAAGAAAAC 

CGGCCCGGGG GACGGTAGAC GTCTACCTCC TGGGGTGAGA CTTTCTTTTG 
3151 TATCGCTTCC ATGCAATCAA TGGCTATGTG ATGGATACAC TCCCTGGCTT 

ATAGCGAAGG TACGTTAGTT ACCGATACAC TACCTATGTG AGGGACCGAA 
3201 AGTAATGGCT CAGAATCAAA GGATCCGATG GTATCTGCTC AGCATGGGCA 

TCATTACCGA GTCTTAGTTT CCTAGGCTAC CATAGACGAG TCGTACCCGT 
3251 GCAATGAAAA TATCCATTCG ATTCATTTTA GCGGACACGT GTTCAGTGTA 

CGTTACTTTT ATAGGTAAGC TAAGTAAAAT CGCCTGTGCA CAAGTCACAT 
3301 CGGAAAAAGG AGGAGTATAA AATGGCCGTG TACAATCTCT ATCCGGGTGT 

GCCTTTTTCC TCCTCATATT TTACCGGCAC ATGTTAGAGA TAGGCCCACA 
3351 CTTTGAGACA GTGGAAATGC TACCGTCCAA AGTTGGAATT TGGCGAATAG 

GAAACTCTGT CACCTTTACG ATGGCAGGTT TCAACCTTAA ACCGCTTATC 
3401 AATGCCTGAT TGGCGAGCAC CTGCAAGCTG GGATGAGCAC GACTTTCCTG

TTACGGACTA ACCGCTCGTG GACGTTCGAC CCTACTCGTG CTGAAAGGAC 
3451 GTGTACAGCA AGAAGTGTCA GACTCCCCTG GGAATGGCTT CTGGACACAT 

CACATGTCGT TCTTCACAGT CTGAGGGGAC CCTTACCGAA GACCTGTGTA 
3501 TAGAGATTTT CAGATTACAG CTTCAGGACA ATATGGACAG TGGGCCCCAA 

ATCTCTAAAA GTCTAATGTC GAAGTCCTGT TATACCTGTC ACCCGGGGTT 
3551 AGCTGGCCAG ACTTCATTAT TCCGGATCAA TCAATGCCTG GAGCACCAAG 

TCGACCGGTC TGAAGTAATA AGGCCTAGTT AGTTACGGAC CTCGTGGTTC 
3601 GAGCCCTTTT CTTGGATCAA GGTGGATCTG TTGGCACCAA TGATTATTCA 

CTCGGGAAAA GAACCTAGTT CCACCTAGAC AACCGTGGTT ACTAATAAGT 
3651 CGGCATCAAG ACCCAGGGTG CCCGTCAGAA GTTCTCCAGC CTCTACATCT 

GCCGTAGTTC TGGGTCCCAC GGGCAGTCTT CAAGAGGTCG GAGATGTAGA 
3701 CTCAGTTTAT CATCATGTAT AGTCTTGATG GGAAGAAGTG GCAGACTTAT 

GAGTCAAATA GTAGTACATA TCAGAACTAC CCTTCTTCAC CGTCTGAATA 
3751 CGAGGAAATT CCACTGGAAC CTTAATGGTC TTCTTTGGCA ATGTGGATTC 

GCTCCTTTAA GGTGACCTTG GAATTACCAG AAGAAACCGT TACACCTAAG 
3801 ATCTGGGATA AAACACAATA TTTTTAACCC TCCAATTATT GCTCGATACA 

TAGACCCTAT TTTGTGTTAT AAAAATTGGG AGGTTAATAA CGAGCTATGT 
3851 TCCGTTTGCA CCCAACTCAT TATAGCATTC GCAGCACTCT TCGCATGGAG 

AGGCAAACGT GGGTTGAGTA ATATCGTAAG CGTCGTGAGA AGCGTACCTC
3901 TTGATGGGCT GTGATTTAAA TAGTTGCAGC ATGCCATTGG GAATGGAGAG 

AACTACCCGA CACTAAATTT ATCAACGTCG TACGGTAACC CTTACCTCTC 
3951 TAAAGCAATA TCAGATGCAC AGATTACTGC TTCATCCTAC TTTACCAATA 

ATTTCGTTAT AGTCTACGTG TCTAATGACG AAGTAGGATG AAATGGTTAT 
4001 TGTTTGCCAC CTGGTCTCCT TCAAAAGCTC GACTTCACCT CCAAGGGAGG 

ACAAACGGTG GACCAGAGGA AGTTTTCGAG CTGAAGTGGA GGTTCCCTCC 
4051 AGTAATGCCT GGAGACCTCA GGTGAATAAT CCAAAAGAGT GGCTGCAAGT 

TCATTACGGA CCTCTGGAGT CCACTTATTA GGTTTTCTCA CCGACGTTCA 
4101 GGACTTCCAG AAGACAATGA AAGTCACAGG AGTAACTACT CAGGGAGTAA 

CCTGAAGGTC TTCTGTTACT TTCAGTGTCC TCATTGATGA GTCCCTCATT 
4151 AATCTCTGCT TACCAGCATG TATGTGAAGG AGTTCCTCAT CTCCAGCAGT 

TTAGAGACGA ATGGTCGTAC ATACACTTCC TCAAGGAGTA GAGGTCGTCA 
4201 CAAGATGGCC ATCAGTGGAC TCTCTTTTTT CAGAATGGCA AAGTAAAGGT

GTTCTACCGG TAGTCACCTG AGAGAAAAAA GTCTTACCGT TTCATTTCCA 
4251 TTTTCAGGGA AATCAAGACT CCTTCACACC TGTGGTGAAC TCTCTAGACC 

AAAAGTCCCT TTAGTTCTGA GGAAGTGTGG ACACCACTTG AGAGATCTGG 
4301 CACCGTTACT GACTCGCTAC CTTCGAATTC ACCCCCAGAG TTGGGTGCAC 

GTGGCAATGA CTGAGCGATG GAAGCTTAAG TGGGGGTCTC AACCCACGTG 
4351 CAGATTGCCC TGAGGATGGA GGTTCTGGGC TGCGAGGCAC AGGACCTCTA 

GTCTAACGGG ACTCCTACCT CCAAGACCCG ACGCTCCGTG TCCTGGAGAT 
4401 C 

G 



1-57 SIGNAL PEPTIDE 
58-1173 A1 DOMAIN
1174-2277 A2 DOMAIN
2278-2349 OL LINKER
2350-3462 ap-A3 DOMAINS
3463-3921 C1 DOMAIN
3922-4401 C2 DOMAIN 
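The nucleotide listing is printed as paired rows: a numbered coding-strand row followed by an unnumbered row that reads as its base-by-base complement. A minimal sketch for checking one pair, assuming plain Watson-Crick pairing and that both rows are written in the same left-to-right column order as printed (helper names are hypothetical):

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def clean(row):
    """Drop the spaces that separate the ten-base groups."""
    return row.replace(" ", "").upper()


def mismatches(top_row, bottom_row):
    """Return 1-based column positions where bottom_row is not the complement of top_row."""
    expected = clean(top_row).translate(COMPLEMENT)
    observed = clean(bottom_row)
    return [i + 1 for i, (e, o) in enumerate(zip(expected, observed)) if e != o]


# First pair of rows of the HP44/OL listing (nucleotides 1-50).
top = "ATGCAGCTAG AGCTCTCCAC CTGTGTCTTT CTGTGTCTCT TGCCACTCGG"
bottom = "TACGTCGATC TCGAGAGGTG GACACAGAAA GACACAGAGA ACGGTGAGCC"

print(mismatches(top, bottom))  # an empty list means the two rows agree
```

A non-empty result flags columns where one of the two strands was probably misread during text extraction; it does not say which strand is wrong.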
AMINO ACID SEQUENCE OF HP46/SQ

1 MQLELSTCVF LCLLPLGFSA IRRYYLGAVE LSWDYRQSEL LRELHVDTRF 

51 PATAPGALPL GPSVLYKKTV FVEFTDQLFS VARPRPPWMG LLGPTIQAEV 

101 YDTVVVTLKN MASHPVSLHA VGVSFWKSSE GAEYEDHTSQ REKEDDKVLP

151 GKSQTYVWQV LKENGPTASD PPCLTYSYLS HVDLVKDLNS GLIGALLVCR 

201 EGSLTRERTQ NLHEFVLLFA VFDEGKSWHS ARNDSWTRAM DPAPARAQPA 

251 MHTVNGYVNR SLPGLIGCHK KSVYWHVIGM GTSPEVHSIF LEGHTFLVRH 

301 HRQASLEISP LTFLTAQTFL MDLGQFLLFC HISSHHHGGM EAHVRVESCA 

351 EEPQLRRKAD EEEDYDDNLY DSDMDVVRLD GDDVSPFIQI RSVAKKHPKT

401 WVHYIAAEEE DWDYAPLVLA PDDRSYKSQY LNNGPQRIGR KYKKVRFMAY 

451 TDETFKTREA IQHESGILGP LLYGEVGDTL LIIFKNQASR PYNIYPHGIT 

501 DVRPLYSRRL PKGVKHLKDF PILPGEIFKY KWTVTVEDGP TKSDPRCLTR 

551 YYSSFVNMER DLASGLIGPL LICYKESVDQ RGNQIMSDKR NVILFSVFDE 

601 NRSWYLTENI QRFLPNPAGV QLEDPEFQAS NIMHSINGYV FDSLQLSVCL 

651 HEVAYWYILS IGAQTDFLSV FFSGYTFKHK MVYEDTLTLF PFSGETVFMS 

701 MENPGLWILG CHNSDFRNRG MTALLKVSSC DKNTGDYYED SYEDISAYLL

751 SKNNAIEPRS FSQNPPVLKR HQREITRTTL QSDQEEIDYD DTISVEMKKE 

801 DFDIYDEDEN QSPRSFQKKT RHYFIAAVER LWDYGMSSSP HVLRNRAQSG 

851 SVPQFKKVVF QEFTDGSFTQ PLYRGELNEH LGLLGPYIRA EVEDNIMVTF

901 RNQASRPYSF YSSLISYEED QRQGAEPRKN FVKPNETKTY FWKVQHHMAP 

951 TKDEFDCKAW AYFSDVDLEK DVHSGLIGPL LVCHTNTLNP AHGRQVTVQE 

1001 FALFFTIFDE TKSWYFTENM ERNCRAPCNI QMEDPTFKEN YRFHAINGYI 

1051 MDTLPGLVMA QDQRIRWYLL SMGSNENIHS IHFSGHVFTV RKKEEYKMAL 

1101 YNLYPGVFET VEMLPSKAGI WRVECLIGEH LHAGMSTLFL VYSNKCQTPL 

1151 GMASGHIRDF QITASGQYGQ WAPKLARLHY SGSINAWSTK EPFSWIKVDL 

1201 LAPMIIHGIK TQGARQKFSS LYISQFIIMY SLDGKKWQTY RGNSTGTLMV 

1251 FFGNVDSSGI KHNIFNPPII ARYIRLHPTH YSIRSTLRME LMGCDLNSCS 

1301 MPLGMESKAI SDAQITASSY FTNMFATWSP SKARLHLQGR SNAWRPQVNN 

1351 PKEWLQVDFQ KTMKVTGVTT QGVKSLLTSM YVKEFLISSS QDGHQWTLFF 

1401 QNGKVKVFQG NQDSFTPVVN SLDPPLLTRY LRIHPQSWVH QIALRMEVLG

1451 CEAQDLY* 



1-19 SIGNAL PEPTIDE 
20-391 A1 DOMAIN
392-759 A2 DOMAIN
760-773 SQ LINKER
774-1144 ap-A3
1145-1297 C1 DOMAIN
1298-1457 C2 DOMAIN 
HP46/SQ NUCLEOTIDE SEQUENCE

1 ATGCAGCTAG AGCTCTCCAC CTGTGTCTTT CTGTGTCTCT TGCCACTCGG 

TACGTCGATC TCGAGAGGTG GACACAGAAA GACACAGAGA ACGGTGAGCC 
51 CTTTAGTGCC ATCAGGAGAT ACTACCTGGG CGCAGTGGAA CTGTCCTGGG 

GAAATCACGG TAGTCCTCTA TGATGGACCC GCGTCACCTT GACAGGACCC 
101 ACTACCGGCA AAGTGAACTC CTCCGTGAGC TGCACGTGGA CACCAGATTT 

TGATGGCCGT TTCACTTGAG GAGGCACTCG ACGTGCACCT GTGGTCTAAA 
151 CCTGCTACAG CGCCAGGAGC TCTTCCGTTG GGCCCGTCAG TCCTGTACAA 

GGACGATGTC GCGGTCCTCG AGAAGGCAAC CCGGGCAGTC AGGACATGTT 
201 AAAGACTGTG TTCGTAGAGT TCACGGATCA ACTTTTCAGC GTTGCCAGGC 

TTTCTGACAC AAGCATCTCA AGTGCCTAGT TGAAAAGTCG CAACGGTCCG 
251 CCAGGCCACC ATGGATGGGT CTGCTGGGTC CTACCATCCA GGCTGAGGTT 

GGTCCGGTGG TACCTACCCA GACGACCCAG GATGGTAGGT CCGACTCCAA 
301 TACGACACGG TGGTCGTTAC CCTGAAGAAC ATGGCTTCTC ATCCCGTTAG 

ATGCTGTGCC ACCAGCAATG GGACTTCTTG TACCGAAGAG TAGGGCAATC 
351 TCTTCACGCT GTCGGCGTCT CCTTCTGGAA ATCTTCCGAA GGCGCTGAAT 

AGAAGTGCGA CAGCCGCAGA GGAAGACCTT TAGAAGGCTT CCGCGACTTA 
401 ATGAGGATCA CACCAGCCAA AGGGAGAAGG AAGACGATAA AGTCCTTCCC 

TACTCCTAGT GTGGTCGGTT TCCCTCTTCC TTCTGCTATT TCAGGAAGGG 
451 GGTAAAAGCC AAACCTACGT CTGGCAGGTC CTGAAAGAAA ATGGTCCAAC 

CCATTTTCGG TTTGGATGCA GACCGTCCAG GACTTTCTTT TACCAGGTTG 
501 AGCCTCTGAC CCACCATGTC TTACCTACTC ATACCTGTCT CACGTGGACC 

TCGGAGACTG GGTGGTACAG AATGGATGAG TATGGACAGA GTGCACCTGG 
551 TGGTGAAAGA CCTGAATTCG GGCCTCATTG GAGCCCTGCT GGTTTGTAGA 

ACCACTTTCT GGACTTAAGC CCGGAGTAAC CTCGGGACGA CCAAACATCT 
601 GAAGGGAGTC TGACCAGAGA AAGGACCCAG AACCTGCACG AATTTGTACT 

CTTCCCTCAG ACTGGTCTCT TTCCTGGGTC TTGGACGTGC TTAAACATGA 
651 ACTTTTTGCT GTCTTTGATG AAGGGAAAAG TTGGCACTCA GCAAGAAATG 

TGAAAAACGA CAGAAACTAC TTCCCTTTTC AACCGTGAGT CGTTCTTTAC 
701 ACTCCTGGAC ACGGGCCATG GATCCCGCAC CTGCCAGGGC CCAGCCTGCA 

TGAGGACCTG TGCCCGGTAC CTAGGGCGTG GACGGTCCCG GGTCGGACGT 
751 ATGCACACAG TCAATGGCTA TGTCAACAGG TCTCTGCCAG GTCTGATCGG 

TACGTGTGTC AGTTACCGAT ACAGTTGTCC AGAGACGGTC CAGACTAGCC 
801 ATGTCATAAG AAATCAGTCT ACTGGCACGT GATTGGAATG GGCACCAGCC 

TACAGTATTC TTTAGTCAGA TGACCGTGCA CTAACCTTAC CCGTGGTCGG 
851 CGGAAGTGCA CTCCATTTTT CTTGAAGGCC ACACGTTTCT CGTGAGGCAC 

GCCTTCACGT GAGGTAAAAA GAACTTCCGG TGTGCAAAGA GCACTCCGTG 
901 CATCGCCAGG CTTCCTTGGA GATCTCGCCA CTAACTTTCC TCACTGCTCA 

GTAGCGGTCC GAAGGAACCT CTAGAGCGGT GATTGAAAGG AGTGACGAGT 
951 GACATTCCTG ATGGACCTTG GCCAGTTCCT ACTGTTTTGT CATATCTCTT 

CTGTAAGGAC TACCTGGAAC CGGTCAAGGA TGACAAAACA GTATAGAGAA 
1001 CCCACCACCA TGGTGGCATG GAGGCTCACG TCAGAGTAGA AAGCTGCGCC 

GGGTGGTGGT ACCACCGTAC CTCCGAGTGC AGTCTCATCT TTCGACGCGG 
1051 GAGGAGCCCC AGCTGCGGAG GAAAGCTGAT GAAGAGGAAG ATTATGATGA 

CTCCTCGGGG TCGACGCCTC CTTTCGACTA CTTCTCCTTC TAATACTACT 
1101 CAATTTGTAC GACTCGGACA TGGACGTGGT CCGGCTCGAT GGTGACGACG 

GTTAAACATG CTGAGCCTGT ACCTGCACCA GGCCGAGCTA CCACTGCTGC 
1151 TGTCTCCCTT TATCCAAATC CGCTCAGTTG CCAAGAAGCA TCCTAAAACT 

ACAGAGGGAA ATAGGTTTAG GCGAGTCAAC GGTTCTTCGT AGGATTTTGA 
1201 TGGGTACATT ACATTGCTGC TGAAGAGGAG GACTGGGACT ATGCTCCCTT 
ACCCATGTAA TGTAACGACG ACTTCTCCTC CTGACCCTGA TACGAGGGAA
1251 AGTCCTCGCC CCCGATGACA GAAGTTATAA AAGTCAATAT TTGAACAATG 

TCAGGAGCGG GGGCTACTGT CTTCAATATT TTCAGTTATA AACTTGTTAC 
1301 GCCCTCAGCG GATTGGTAGG AAGTACAAAA AAGTCCGATT TATGGCATAC 

CGGGAGTCGC CTAACCATCC TTCATGTTTT TTCAGGCTAA ATACCGTATG 
1351 ACAGATGAAA CCTTTAAGAC GCGTGAAGCT ATTCAGCATG AATCAGGAAT 

TGTCTACTTT GGAAATTCTG CGCACTTCGA TAAGTCGTAC TTAGTCCTTA 
1401 CTTGGGACCT TTACTTTATG GGGAAGTTGG AGACACACTG TTGATTATAT 

GAACCCTGGA AATGAAATAC CCCTTCAACC TCTGTGTGAC AACTAATATA 
1451 TTAAGAATCA AGCAAGCAGA CCATATAACA TCTACCCTCA CGGAATCACT 

AATTCTTAGT TCGTTCGTCT GGTATATTGT AGATGGGAGT GCCTTAGTGA 
1501 GATGTCCGTC CTTTGTATTC AAGGAGATTA CCAAAAGGTG TAAAACATTT 

CTACAGGCAG GAAACATAAG TTCCTCTAAT GGTTTTCCAC ATTTTGTAAA 
1551 GAAGGATTTT CCAATTCTGC CAGGAGAAAT ATTCAAATAT AAATGGACAG 

CTTCCTAAAA GGTTAAGACG GTCCTCTTTA TAAGTTTATA TTTACCTGTC 
1601 TGACTGTAGA AGATGGGCCA ACTAAATCAG ATCCGCGGTG CCTGACCCGC 

ACTGACATCT TCTACCCGGT TGATTTAGTC TAGGCGCCAC GGACTGGGCG 
1651 TATTACTCTA GTTTCGTTAA TATGGAGAGA GATCTAGCTT CAGGACTCAT 

ATAATGAGAT CAAAGCAATT ATACCTCTCT CTAGATCGAA GTCCTGAGTA 
1701 TGGCCCTCTC CTCATCTGCT ACAAAGAATC TGTAGATCAA AGAGGAAACC

ACCGGGAGAG GAGTAGACGA TGTTTCTTAG ACATCTAGTT TCTCCTTTGG 
1751 AGATAATGTC AGACAAGAGG AATGTCATCC TGTTTTCTGT ATTTGATGAG 

TCTATTACAG TCTGTTCTCC TTACAGTAGG ACAAAAGACA TAAACTACTC 
1801 AACCGAAGCT GGTACCTCAC AGAGAATATA CAACGCTTTC TCCCCAATCC 

TTGGCTTCGA CCATGGAGTG TCTCTTATAT GTTGCGAAAG AGGGGTTAGG 
1851 AGCTGGAGTG CAGCTTGAGG ATCCAGAGTT CCAAGCCTCC AACATCATGC

TCGACCTCAC GTCGAACTCC TAGGTCTCAA GGTTCGAAGG TTGTAGTACG 
1901 ACAGCATCAA TGGCTATGTT TTTGATAGTT TGCAGTTGTC AGTTTGTTTG 

TGTCGTAGTT ACCGATACAA AAACTATCAA ACGTCAACAG TCAAACAAAC 
1951 CATGAGGTGG CATACTGGTA CATTCTAAGC ATTGGAGCAC AGACTGACTT 

GTACTCCACC GTATGACCAT GTAAGATTCG TAACCTCGTG TCTGACTGAA 
2001 CCTTTCTGTC TTCTTCTCTG GATATACCTT CAAACACAAA ATGGTCTATG 

GGAAAGACAG AAGAAGAGAC CTATATGGAA GTTTGTGTTT TACCAGATAC 
2051 AAGACACACT CACCCTATTC CCATTCTCAG GAGAAACTGT CTTCATGTCG 

TTCTGTGTGA GTGGGATAAG GGTAAGAGTC CTCTTTGACA GAAGTACAGC 
2101 ATGGAAAACC CAGGTCTATG GATTCTGGGG TGCCACAACT CAGACTTTCG 

TACCTTTTGG GTCCAGATAC CTAAGACCCC ACGGTGTTGA GTCTGAAAGC 
2151 GAACAGAGGC ATGACCGCCT TACTGAAGGT TTCTAGTTGT GACAAGAACA 

CTTGTCTCCG TACTGGCGGA ATGACTTCCA AAGATCAACA CTGTTCTTGT 
2201 CTGGTGATTA TTACGAGGAC AGTTATGAAG ATATTTCAGC ATACTTGCTG 

GACCACTAAT AATGCTCCTG TCAATACTTC TATAAAGTCG TATGAACGAC 
2251 AGTAAAAACA ATGCCATTGA ACCTAGGAGC TTCTCTCAGA ATCCACCAGT 

TCATTTTTGT TACGGTAACT TGGATCCTCG AAGAGAGTCT TAGGTGGTCA 
2301 CTTGAAACGC CATCAACGGG AAATAACTCG TACTACTCTT CAGTCAGATC

GAACTTTGCG GTAGTTGCCC TTTATTGAGC ATGATGAGAA GTCAGTCTAG 
2351 AAGAGGAAAT TGACTATGAT GATACCATAT CAGTTGAAAT GAAGAAGGAA 

TTCTCCTTTA ACTGATACTA CTATGGTATA GTCAACTTTA CTTCTTCCTT 
2401 GATTTTGACA TTTATGATGA GGATGAAAAT CAGAGCCCCC GCAGCTTTCA 

CTAAAACTGT AAATACTACT CCTACTTTTA GTCTCGGGGG CGTCGAAAGT 
2451 AAAGAAAACA CGACACTATT TTATTGCTGC AGTGGAGAGG CTCTGGGATT 

TTTCTTTTGT GCTGTGATAA AATAACGACG TCACCTCTCC GAGACCCTAA 
2501 ATGGGATGAG TAGCTCCCCA CATGTTCTAA GAAACAGGGC TCAGAGTGGC 
TACCCTACTC ATCGAGGGGT GTACAAGATT CTTTGTCCCG AGTCTCACCG
2551 AGTGTCCCTC AGTTCAAGAA AGTTGTTTTC CAGGAATTTA CTGATGGCTC 

TCACAGGGAG TCAAGTTCTT TCAACAAAAG GTCCTTAAAT GACTACCGAG 
2601 CTTTACTCAG CCCTTATACC GTGGAGAACT AAATGAACAT TTGGGACTCC 

GAAATGAGTC GGGAATATGG CACCTCTTGA TTTACTTGTA AACCCTGAGG 
2651 TGGGGCCATA TATAAGAGCA GAAGTTGAAG ATAATATCAT GGTAACTTTC 

ACCCCGGTAT ATATTCTCGT CTTCAACTTC TATTATAGTA CCATTGAAAG 
2701 AGAAATCAGG CCTCTCGTCC CTATTCCTTC TATTCTAGCC TTATTTCTTA 

TCTTTAGTCC GGAGAGCAGG GATAAGGAAG ATAAGATCGG AATAAAGAAT 
2751 TGAGGAAGAT CAGAGGCAAG GAGCAGAACC TAGAAAAAAC TTTGTCAAGC 

ACTCCTTCTA GTCTCCGTTC CTCGTCTTGG ATCTTTTTTG AAACAGTTCG 
2801 CTAATGAAAC CAAAACTTAC TTTTGGAAAG TGCAACATCA TATGGCACCC 

GATTACTTTG GTTTTGAATG AAAACCTTTC ACGTTGTAGT ATACCGTGGG 
2851 ACTAAAGATG AGTTTGACTG CAAAGCCTGG GCTTATTTCT CTGATGTTGA 

TGATTTCTAC TCAAACTGAC GTTTCGGACC CGAATAAAGA GACTACAACT 
2901 CCTGGAAAAA GATGTGCACT CAGGCCTGAT TGGACCCCTT CTGGTCTGCC 

GGACCTTTTT CTACACGTGA GTCCGGACTA ACCTGGGGAA GACCAGACGG 
2951 ACACTAACAC ACTGAACCCT GCTCATGGGA GACAAGTGAC AGTACAGGAA 

TGTGATTGTG TGACTTGGGA CGAGTACCCT CTGTTCACTG TCATGTCCTT 
3001 TTTGCTCTGT TTTTCACCAT CTTTGATGAG ACCAAAAGCT GGTACTTCAC 

AAACGAGACA AAAAGTGGTA GAAACTACTC TGGTTTTCGA CCATGAAGTG 
3051 TGAAAATATG GAAAGAAACT GCAGGGCTCC CTGCAATATC CAGATGGAAG 

ACTTTTATAC CTTTCTTTGA CGTCCCGAGG GACGTTATAG GTCTACCTTC 
3101 ATCCCACTTT TAAAGAGAAT TATCGCTTCC ATGCAATCAA TGGCTACATA 

TAGGGTGAAA ATTTCTCTTA ATAGCGAAGG TACGTTAGTT ACCGATGTAT 
3151 ATGGATACAC TACCTGGCTT AGTAATGGCT CAGGATCAAA GGATTCGATG 

TACCTATGTG ATGGACCGAA TCATTACCGA GTCCTAGTTT CCTAAGCTAC 
3201 GTATCTGCTC AGCATGGGCA GCAATGAAAA CATCCATTCT ATTCATTTCA 

CATAGACGAG TCGTACCCGT CGTTACTTTT GTAGGTAAGA TAAGTAAAGT 
3251 GTGGACATGT GTTCACTGTA CGAAAAAAAG AGGAGTATAA AATGGCACTG 

CACCTGTACA CAAGTGACAT GCTTTTTTTC TCCTCATATT TTACCGTGAC 
3301 TACAATCTCT ATCCAGGTGT TTTTGAGACA GTGGAAATGT TACCATCCAA 

ATGTTAGAGA TAGGTCCACA AAAACTCTGT CACCTTTACA ATGGTAGGTT 
3351 AGCTGGAATT TGGCGGGTGG AATGCCTTAT TGGCGAGCAT CTACATGCTG 

TCGACCTTAA ACCGCCCACC TTACGGAATA ACCGCTCGTA GATGTACGAC 
3401 GGATGAGCAC ACTTTTTCTG GTGTACAGCA ATAAGTGTCA GACTCCCCTG 

CCTACTCGTG TGAAAAAGAC CACATGTCGT TATTCACAGT CTGAGGGGAC 
3451 GGAATGGCTT CTGGACACAT TAGAGATTTT CAGATTACAG CTTCAGGACA 

CCTTACCGAA GACCTGTGTA ATCTCTAAAA GTCTAATGTC GAAGTCCTGT 
3501 ATATGGACAG TGGGCCCCAA AGCTGGCCAG ACTTCATTAT TCCGGATCAA 

TATACCTGTC ACCCGGGGTT TCGACCGGTC TGAAGTAATA AGGCCTAGTT 
3551 TCAATGCCTG GAGCACCAAG GAGCCCTTTT CTTGGATCAA GGTGGATCTG 

AGTTACGGAC CTCGTGGTTC CTCGGGAAAA GAACCTAGTT CCACCTAGAC 
3601 TTGGCACCAA TGATTATTCA CGGCATCAAG ACCCAGGGTG CCCGTCAGAA 

AACCGTGGTT ACTAATAAGT GCCGTAGTTC TGGGTCCCAC GGGCAGTCTT 
3651 GTTCTCCAGC CTCTACATCT CTCAGTTTAT CATCATGTAT AGTCTTGATG 

CAAGAGGTCG GAGATGTAGA GAGTCAAATA GTAGTACATA TCAGAACTAC 
3701 GGAAGAAGTG GCAGACTTAT CGAGGAAATT CCACTGGAAC CTTAATGGTC 

CCTTCTTCAC CGTCTGAATA GCTCCTTTAA GGTGACCTTG GAATTACCAG 
3751 TTCTTTGGCA ATGTGGATTC ATCTGGGATA AAACACAATA TTTTTAACCC 

AAGAAACCGT TACACCTAAG TAGACCCTAT TTTGTGTTAT AAAAATTGGG 
3801 TCCAATTATT GCTCGATACA TCCGTTTGCA CCCAACTCAT TATAGCATTC 

AGGTTAATAA CGAGCTATGT AGGCAAACGT GGGTTGAGTA ATATCGTAAG 
3851 GCAGCACTCT TCGCATGGAG TTGATGGGCT GTGATTTAAA TAGTTGCAGC 

CGTCGTGAGA AGCGTACCTC AACTACCCGA CACTAAATTT ATCAACGTCG 
3901 ATGCCATTGG GAATGGAGAG TAAAGCAATA TCAGATGCAC AGATTACTGC 

TACGGTAACC CTTACCTCTC ATTTCGTTAT AGTCTACGTG TCTAATGACG 
3951 TTCATCCTAC TTTACCAATA TGTTTGCCAC CTGGTCTCCT TCAAAAGCTC 

AAGTAGGATG AAATGGTTAT ACAAACGGTG GACCAGAGGA AGTTTTCGAG 
4001 GACTTCACCT CCAAGGGAGG AGTAATGCCT GGAGACCTCA GGTGAATAAT 

CTGAAGTGGA GGTTCCCTCC TCATTACGGA CCTCTGGAGT CCACTTATTA 
4051 CCAAAAGAGT GGCTGCAAGT GGACTTCCAG AAGACAATGA AAGTCACAGG 

GGTTTTCTCA CCGACGTTCA CCTGAAGGTC TTCTGTTACT TTCAGTGTCC 
4101 AGTAACTACT CAGGGAGTAA AATCTCTGCT TACCAGCATG TATGTGAAGG 

TCATTGATGA GTCCCTCATT TTAGAGACGA ATGGTCGTAC ATACACTTCC 
4151 AGTTCCTCAT CTCCAGCAGT CAAGATGGCC ATCAGTGGAC TCTCTTTTTT 

TCAAGGAGTA GAGGTCGTCA GTTCTACCGG TAGTCACCTG AGAGAAAAAA 
4201 CAGAATGGCA AAGTAAAGGT TTTTCAGGGA AATCAAGACT CCTTCACACC 

GTCTTACCGT TTCATTTCCA AAAAGTCCCT TTAGTTCTGA GGAAGTGTGG 
4251 TGTGGTGAAC TCTCTAGACC CACCGTTACT GACTCGCTAC CTTCGAATTC 

ACACCACTTG AGAGATCTGG GTGGCAATGA CTGAGCGATG GAAGCTTAAG 
4301 ACCCCCAGAG TTGGGTGCAC CAGATTGCCC TGAGGATGGA GGTTCTGGGC 

TGGGGGTCTC AACCCACGTG GTCTAACGGG ACTCCTACCT CCAAGACCCG 
4351 TGCGAGGCAC AGGACCTCTA C 

ACGCTCCGTG TCCTGGAGAT G 



1-57 SIGNAL PEPTIDE 
58-1173 A1 DOMAIN 
1174-2277 A2 DOMAIN 
2278-2319 SQ LINKER 
2320-3432 ap-A3 DOMAINS 
3433-3891 C1 DOMAIN 
3892-4371 C2 DOMAIN 
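
The nucleotide boundaries in the table above can be converted to amino acid residue numbers, since every listed domain begins and ends on a codon boundary. The following is a minimal Python sketch, assuming 1-based inclusive coordinates with nucleotide 1 as the first base of the initiator ATG. The names `HSQ_NT_DOMAINS` and `nt_range_to_aa` are illustrative and do not come from the source.

```python
# Domain boundaries of the HSQ coding sequence, copied from the table above
# (1-based, inclusive nucleotide positions).
HSQ_NT_DOMAINS = [
    ("SIGNAL PEPTIDE", 1, 57),
    ("A1 DOMAIN", 58, 1173),
    ("A2 DOMAIN", 1174, 2277),
    ("SQ LINKER", 2278, 2319),
    ("ap-A3 DOMAINS", 2320, 3432),
    ("C1 DOMAIN", 3433, 3891),
    ("C2 DOMAIN", 3892, 4371),
]


def nt_range_to_aa(start, end):
    """Map an in-frame nucleotide range to the residue range it encodes."""
    # The range must start on the first base of a codon and end on the last.
    if (start - 1) % 3 != 0 or end % 3 != 0:
        raise ValueError(f"range {start}-{end} is not codon-aligned")
    return (start - 1) // 3 + 1, end // 3


for name, nt_start, nt_end in HSQ_NT_DOMAINS:
    aa_start, aa_end = nt_range_to_aa(nt_start, nt_end)
    print(f"{nt_start}-{nt_end} nt -> residues {aa_start}-{aa_end} {name}")
```

Under these assumptions the 4371-nucleotide coding sequence corresponds to 1457 residues, which agrees with the final residue number of the HSQ amino acid sequence.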

AMINO ACID SEQUENCE OF HP47/OL 

1 MQLELSTCVF LCLLPLGFSA IRRYYLGAVE LSWDYRQSEL LRELHVDTRF 

51 PATAPGALPL GPSVLYKKTV FVEFTDQLFS VARPRPPWMG LLGPTIQAEV 

101 YDTVVVTLKN MASHPVSLHA VGVSFWKSSE GAEYEDHTSQ REKEDDKVLP 

151 GKSQTYVWQV LKENGPTASD PPCLTYSYLS HVDLVKDLNS GLIGALLVCR 

201 EGSLTRERTQ NLHEFVLLFA VFDEGKSWHS ARNDSWTRAM DPAPARAQPA 

251 MHTVNGYVNR SLPGLIGCHK KSVYWHVIGM GTSPEVHSIF LEGHTFLVRH 

301 HRQASLEISP LTFLTAQTFL MDLGQFLLFC HISSHHHGGM EAHVRVESCA 

351 EEPQLRRKAD EEEDYDDNLY DSDMDVVRLD GDDVSPFIQI RSVAKKHPKT 

401 WVHYIAAEEE DWDYAPLVLA PDDRSYKSQY LNNGPQRIGR KYKKVRFMAY 

451 TDETFKTREA IQHESGILGP LLYGEVGDTL LIIFKNQASR PYNIYPHGIT 

501 DVRPLYSRRL PKGVKHLKDF PILPGEIFKY KWTVTVEDGP TKSDPRCLTR 

551 YYSSFVNMER DLASGLIGPL LICYKESVDQ RGNQIMSDKR NVILFSVFDE 

601 NRSWYLTENI QRFLPNPAGV QLEDPEFQAS NIMHSINGYV FDSLQLSVCL 

651 HEVAYWYILS IGAQTDFLSV FFSGYTFKHK MVYEDTLTLF PFSGETVFMS 

701 MENPGLWILG CHNSDFRNRG MTALLKVSSC DKNTGDYYED SYEDISAYLL 

751 SKNNAIEPRS FAQNSRPPSA SAPKPPVLRR HQRDISLPTF QPEEDKMDYD 

801 DIFSTETKGE DFDIYGEDEN QDPRSFQKRT RHYFIAAVEQ LWDYGMSESP 

851 RALRNRAQNG EVPRFKKVVF REFADGSFTQ PSYRGELNKH LGLLGPYIRA 

901 EVEDNIMVTF KNQASRPYSF YSSLISYPDD QEQGAEPRHN FVQPNETRTY 

951 FWKVQHHMAP TEDEFDCKAW AYFSDVDLEK DVHSGLIGPL LICRANTLNA 

1001 AHGRQVTVQE FALFFTIFDE TKSWYFTENV ERNCRAPCHL QMEDPTLKEN 

1051 YRFHAINGYV MDTLPGLVMA QNQRIRWYLL SMGSNENIHS IHFSGHVFSV 

1101 RKKEEYKMAV YNLYPGVFET VEMLPSKVGI WRIECLIGEH LQAGMSTTFL 

1151 VYSKKCQTPL GMASGHIRDF QITASGQYGQ WAPKLARLHY SGSINAWSTK 

1201 EPFSWIKVDL LAPMIIHGIK TQGARQKFSS LYISQFIIMY SLDGKKWQTY 

1251 RGNSTGTLMV FFGNVDSSGI KHNIFNPPII ARYIRLHPTH YSIRSTLRME 

1301 LMGCDLNSCS MPLGMESKAI SDAQITASSY FTNMFATWSP SKARLHLQGR 

1351 SNAWRPQVNN PKEWLQVDFQ KTMKVTGVTT QGVKSLLTSM YVKEFLISSS 

1401 QDGHQWTLFF QNGKVKVFQG NQDSFTPVVN SLDPPLLTRY LRIHPQSWVH 

1451 QIALRMEVLG CEAQDLY* 



1-19 SIGNAL PEPTIDE 
20-391 A1 DOMAIN 
392-759 A2 DOMAIN 
760-783 OL LINKER 
784-1154 ap-A3 
1155-1307 C1 DOMAIN 
1308-1467 C2 DOMAIN 
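
The residue ranges above can be checked for gaps and mapped back onto the coding sequence. The sketch below is illustrative only: it assumes 1-based inclusive residue numbers and a coding sequence whose first base is nucleotide 1, and the helper names are hypothetical rather than taken from the source.

```python
# Domain boundaries of HP47/OL, copied from the table above
# (1-based, inclusive residue positions).
HP47_OL_AA_DOMAINS = [
    ("SIGNAL PEPTIDE", 1, 19),
    ("A1 DOMAIN", 20, 391),
    ("A2 DOMAIN", 392, 759),
    ("OL LINKER", 760, 783),
    ("ap-A3", 784, 1154),
    ("C1 DOMAIN", 1155, 1307),
    ("C2 DOMAIN", 1308, 1467),
]


def domain_lengths(domains):
    """Return the number of residues in each domain."""
    return {name: end - start + 1 for name, start, end in domains}


def is_contiguous(domains):
    """True if each domain starts one residue after the previous one ends."""
    return all(
        nxt[1] == prev[2] + 1 for prev, nxt in zip(domains, domains[1:])
    )


def aa_range_to_nt(start, end):
    """Map a residue range to the nucleotide range that encodes it."""
    return 3 * (start - 1) + 1, 3 * end


lengths = domain_lengths(HP47_OL_AA_DOMAINS)
print(lengths)
print("contiguous:", is_contiguous(HP47_OL_AA_DOMAINS))
print("C2 DOMAIN nucleotides:", aa_range_to_nt(1308, 1467))
```

With these inputs the domain lengths sum to 1467 residues, and the C2 domain maps to a range ending at nucleotide 4401, the last position of the HP47/OL nucleotide sequence.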

HP47/OL NUCLEOTIDE SEQUENCE 

1 ATGCAGCTAG AGCTCTCCAC CTGTGTCTTT CTGTGTCTCT TGCCACTCGG 

TACGTCGATC TCGAGAGGTG GACACAGAAA GACACAGAGA ACGGTGAGCC 
51 CTTTAGTGCC ATCAGGAGAT ACTACCTGGG CGCAGTGGAA CTGTCCTGGG 

GAAATCACGG TAGTCCTCTA TGATGGACCC GCGTCACCTT GACAGGACCC 
101 ACTACCGGCA AAGTGAACTC CTCCGTGAGC TGCACGTGGA CACCAGATTT 

TGATGGCCGT TTCACTTGAG GAGGCACTCG ACGTGCACCT GTGGTCTAAA 
151 CCTGCTACAG CGCCAGGAGC TCTTCCGTTG GGCCCGTCAG TCCTGTACAA 

GGACGATGTC GCGGTCCTCG AGAAGGCAAC CCGGGCAGTC AGGACATGTT 
201 AAAGACTGTG TTCGTAGAGT TCACGGATCA ACTTTTCAGC GTTGCCAGGC 

TTTCTGACAC AAGCATCTCA AGTGCCTAGT TGAAAAGTCG CAACGGTCCG 
251 CCAGGCCACC ATGGATGGGT CTGCTGGGTC CTACCATCCA GGCTGAGGTT 

GGTCCGGTGG TACCTACCCA GACGACCCAG GATGGTAGGT CCGACTCCAA 
301 TACGACACGG TGGTCGTTAC CCTGAAGAAC ATGGCTTCTC ATCCCGTTAG 

ATGCTGTGCC ACCAGCAATG GGACTTCTTG TACCGAAGAG TAGGGCAATC 
351 TCTTCACGCT GTCGGCGTCT CCTTCTGGAA ATCTTCCGAA GGCGCTGAAT 

AGAAGTGCGA CAGCCGCAGA GGAAGACCTT TAGAAGGCTT CCGCGACTTA 
401 ATGAGGATCA CACCAGCCAA AGGGAGAAGG AAGACGATAA AGTCCTTCCC 

TACTCCTAGT GTGGTCGGTT TCCCTCTTCC TTCTGCTATT TCAGGAAGGG 
451 GGTAAAAGCC AAACCTACGT CTGGCAGGTC CTGAAAGAAA ATGGTCCAAC 

CCATTTTCGG TTTGGATGCA GACCGTCCAG GACTTTCTTT TACCAGGTTG 
501 AGCCTCTGAC CCACCATGTC TTACCTACTC ATACCTGTCT CACGTGGACC 

TCGGAGACTG GGTGGTACAG AATGGATGAG TATGGACAGA GTGCACCTGG 
551 TGGTGAAAGA CCTGAATTCG GGCCTCATTG GAGCCCTGCT GGTTTGTAGA 

ACCACTTTCT GGACTTAAGC CCGGAGTAAC CTCGGGACGA CCAAACATCT 
601 GAAGGGAGTC TGACCAGAGA AAGGACCCAG AACCTGCACG AATTTGTACT 

CTTCCCTCAG ACTGGTCTCT TTCCTGGGTC TTGGACGTGC TTAAACATGA 
651 ACTTTTTGCT GTCTTTGATG AAGGGAAAAG TTGGCACTCA GCAAGAAATG 

TGAAAAACGA CAGAAACTAC TTCCCTTTTC AACCGTGAGT CGTTCTTTAC 
701 ACTCCTGGAC ACGGGCCATG GATCCCGCAC CTGCCAGGGC CCAGCCTGCA 

TGAGGACCTG TGCCCGGTAC CTAGGGCGTG GACGGTCCCG GGTCGGACGT 
751 ATGCACACAG TCAATGGCTA TGTCAACAGG TCTCTGCCAG GTCTGATCGG 

TACGTGTGTC AGTTACCGAT ACAGTTGTCC AGAGACGGTC CAGACTAGCC 
801 ATGTCATAAG AAATCAGTCT ACTGGCACGT GATTGGAATG GGCACCAGCC 

TACAGTATTC TTTAGTCAGA TGACCGTGCA CTAACCTTAC CCGTGGTCGG 
851 CGGAAGTGCA CTCCATTTTT CTTGAAGGCC ACACGTTTCT CGTGAGGCAC 

GCCTTCACGT GAGGTAAAAA GAACTTCCGG TGTGCAAAGA GCACTCCGTG 
901 CATCGCCAGG CTTCCTTGGA GATCTCGCCA CTAACTTTCC TCACTGCTCA 

GTAGCGGTCC GAAGGAACCT CTAGAGCGGT GATTGAAAGG AGTGACGAGT 
951 GACATTCCTG ATGGACCTTG GCCAGTTCCT ACTGTTTTGT CATATCTCTT 

CTGTAAGGAC TACCTGGAAC CGGTCAAGGA TGACAAAACA GTATAGAGAA 
1001 CCCACCACCA TGGTGGCATG GAGGCTCACG TCAGAGTAGA AAGCTGCGCC 

GGGTGGTGGT ACCACCGTAC CTCCGAGTGC AGTCTCATCT TTCGACGCGG 
1051 GAGGAGCCCC AGCTGCGGAG GAAAGCTGAT GAAGAGGAAG ATTATGATGA 

CTCCTCGGGG TCGACGCCTC CTTTCGACTA CTTCTCCTTC TAATACTACT 
1101 CAATTTGTAC GACTCGGACA TGGACGTGGT CCGGCTCGAT GGTGACGACG 

GTTAAACATG CTGAGCCTGT ACCTGCACCA GGCCGAGCTA CCACTGCTGC 
1151 TGTCTCCCTT TATCCAAATC CGCTCGGTTG CCAAGAAGCA TCCTAAAACT 

ACAGAGGGAA ATAGGTTTAG GCGAGCCAAC GGTTCTTCGT AGGATTTTGA 
1201 TGGGTACATT ACATTGCTGC TGAAGAGGAG GACTGGGACT ATGCTCCCTT 

ACCCATGTAA TGTAACGACG ACTTCTCCTC CTGACCCTGA TACGAGGGAA 
1251 AGTCCTCGCC CCCGATGACA GAAGTTATAA AAGTCAATAT TTGAACAATG 

TCAGGAGCGG GGGCTACTGT CTTCAATATT TTCAGTTATA AACTTGTTAC 
1301 GCCCTCAGCG GATTGGTAGG AAGTACAAAA AAGTCCGATT TATGGCATAC 

CGGGAGTCGC CTAACCATCC TTCATGTTTT TTCAGGCTAA ATACCGTATG 
1351 ACAGATGAAA CCTTTAAGAC GCGTGAAGCT ATTCAGCATG AATCAGGAAT 

TGTCTACTTT GGAAATTCTG CGCACTTCGA TAAGTCGTAC TTAGTCCTTA 
1401 CTTGGGACCT TTACTTTATG GGGAAGTTGG AGACACACTG TTGATTATAT 

GAACCCTGGA AATGAAATAC CCCTTCAACC TCTGTGTGAC AACTAATATA 
1451 TTAAGAATCA AGCAAGCAGA CCATATAACA TCTACCCTCA CGGAATCACT 

AATTCTTAGT TCGTTCGTCT GGTATATTGT AGATGGGAGT GCCTTAGTGA 
1501 GATGTCCGTC CTTTGTATTC AAGGAGATTA CCAAAAGGTG TAAAACATTT 

CTACAGGCAG GAAACATAAG TTCCTCTAAT GGTTTTCCAC ATTTTGTAAA 
1551 GAAGGATTTT CCAATTCTGC CAGGAGAAAT ATTCAAATAT AAATGGACAG 

CTTCCTAAAA GGTTAAGACG GTCCTCTTTA TAAGTTTATA TTTACCTGTC 
1601 TGACTGTAGA AGATGGGCCA ACTAAATCAG ATCCGCGGTG CCTGACCCGC 

ACTGACATCT TCTACCCGGT TGATTTAGTC TAGGCGCCAC GGACTGGGCG 
1651 TATTACTCTA GTTTCGTTAA TATGGAGAGA GATCTAGCTT CAGGACTCAT 

ATAATGAGAT CAAAGCAATT ATACCTCTCT CTAGATCGAA GTCCTGAGTA 
1701 TGGCCCTCTC CTCATCTGCT ACAAAGAATC TGTAGATCAA AGAGGAAACC 

ACCGGGAGAG GAGTAGACGA TGTTTCTTAG ACATCTAGTT TCTCCTTTGG 
1751 AGATAATGTC AGACAAGAGG AATGTCATCC TGTTTTCTGT ATTTGATGAG 

TCTATTACAG TCTGTTCTCC TTACAGTAGG ACAAAAGACA TAAACTACTC 
1801 AACCGAAGCT GGTACCTCAC AGAGAATATA CAACGCTTTC TCCCCAATCC 

TTGGCTTCGA CCATGGAGTG TCTCTTATAT GTTGCGAAAG AGGGGTTAGG 
1851 AGCTGGAGTG CAGCTTGAGG ATCCAGAGTT CCAAGCCTCC AACATCATGC 

TCGACCTCAC GTCGAACTCC TAGGTCTCAA GGTTCGGAGG TTGTAGTACG 
1901 ACAGCATCAA TGGCTATGTT TTTGATAGTT TGCAGTTGTC AGTTTGTTTG 

TGTCGTAGTT ACCGATACAA AAACTATCAA ACGTCAACAG TCAAACAAAC 
1951 CATGAGGTGG CATACTGGTA CATTCTAAGC ATTGGAGCAC AGACTGACTT 

GTACTCCACC GTATGACCAT GTAAGATTCG TAACCTCGTG TCTGACTGAA 
2001 CCTTTCTGTC TTCTTCTCTG GATATACCTT CAAACACAAA ATGGTCTATG 

GGAAAGACAG AAGAAGAGAC CTATATGGAA GTTTGTGTTT TACCAGATAC 
2051 AAGACACACT CACCCTATTC CCATTCTCAG GAGAAACTGT CTTCATGTCG 

TTCTGTGTGA GTGGGATAAG GGTAAGAGTC CTCTTTGACA GAAGTACAGC 
2101 ATGGAAAACC CAGGTCTATG GATTCTGGGG TGCCACAACT CAGACTTTCG 

TACCTTTTGG GTCCAGATAC CTAAGACCCC ACGGTGTTGA GTCTGAAAGC 
2151 GAACAGAGGC ATGACCGCCT TACTGAAGGT TTCTAGTTGT GACAAGAACA 

CTTGTCTCCG TACTGGCGGA ATGACTTCCA AAGATCAACA CTGTTCTTGT 
2201 CTGGTGATTA TTACGAGGAC AGTTATGAAG ATATTTCAGC ATACTTGCTG 

GACCACTAAT AATGCTCCTG TCAATACTTC TATAAAGTCG TATGAACGAC 
2251 AGTAAAAACA ATGCCATTGA ACCTAGGAGC TTTGCCCAGA ATTCAAGACC 

TCATTTTTGT TACGGTAACT TGGATCCTCG AAACGGGTCT TAAGTTCTGG 
2301 CCCTAGTGCG AGCGCTCCAA AGCCTCCGGT CCTGCGACGG CATCAGAGGG 

GGGATCACGC TCGCGAGGTT TCGGAGGCCA GGACGCTGCC GTAGTCTCCC 
2351 ACATAAGCCT TCCTACTTTT CAGCCGGAGG AAGACAAAAT GGACTATGAT 

TGTATTCGGA AGGATGAAAA GTCGGCCTCC TTCTGTTTTA CCTGATACTA 
2401 GATATCTTCT CAACTGAAAC GAAGGGAGAA GATTTTGACA TTTACGGTGA 

CTATAGAAGA GTTGACTTTG CTTCCCTCTT CTAAAACTGT AAATGCCACT 
2451 GGATGAAAAT CAGGACCCTC GCAGCTTTCA GAAGAGAACC CGACACTATT 

CCTACTTTTA GTCCTGGGAG CGTCGAAAGT CTTCTCTTGG GCTGTGATAA 
2501 TCATTGCTGC GGTGGAGCAG CTCTGGGATT ACGGGATGAG CGAATCCCCC 

AGTAACGACG CCACCTCGTC GAGACCCTAA TGCCCTACTC GCTTAGGGGG 
2551 CGGGCGCTAA GAAACAGGGC TCAGAACGGA GAGGTGCCTC GGTTCAAGAA 

GCCCGCGATT CTTTGTCCCG AGTCTTGCCT CTCCACGGAG CCAAGTTCTT 
2601 GGTGGTCTTC CGGGAATTTG CTGACGGCTC CTTCACGCAG CCGTCGTACC 

CCACCAGAAG GCCCTTAAAC GACTGCCGAG GAAGTGCGTC GGCAGCATGG 
2651 GCGGGGAACT CAACAAACAC TTGGGGCTCT TGGGACCCTA CATCAGAGCG 

CGCCCCTTGA GTTGTTTGTG AACCCCGAGA ACCCTGGGAT GTAGTCTCGC 
2701 GAAGTTGAAG ACAACATCAT GGTAACTTTC AAAAACCAGG CGTCTCGTCC 

CTTCAACTTC TGTTGTAGTA CCATTGAAAG TTTTTGGTCC GCAGAGCAGG 
2751 CTATTCCTTC TACTCGAGCC TTATTTCTTA TCCGGATGAT CAGGAGCAAG 

GATAAGGAAG ATGAGCTCGG AATAAAGAAT AGGCCTACTA GTCCTCGTTC 
2801 GGGCAGAACC TCGACACAAC TTCGTCCAGC CAAATGAAAC CAGAACTTAC 

CCCGTCTTGG AGCTGTGTTG AAGCAGGTCG GTTTACTTTG GTCTTGAATG 
2851 TTTTGGAAAG TGCAGCATCA CATGGCACCC ACAGAAGACG AGTTTGACTG 

AAAACCTTTC ACGTCGTAGT GTACCGTGGG TGTCTTCTGC TCAAACTGAC 
2901 CAAAGCCTGG GCCTACTTTT CTGATGTTGA CCTGGAAAAA GATGTGCACT 

GTTTCGGACC CGGATGAAAA GACTACAACT GGACCTTTTT CTACACGTGA 
2951 CAGGCTTGAT CGGCCCCCTT CTGATCTGCC GCGCCAACAC CCTGAACGCT 

GTCCGAACTA GCCGGGGGAA GACTAGACGG CGCGGTTGTG GGACTTGCGA 
3001 GCTCACGGTA GACAAGTGAC CGTGCAAGAA TTTGCTCTGT TTTTCACTAT 

CGAGTGCCAT CTGTTCACTG GCACGTTCTT AAACGAGACA AAAAGTGATA 
3051 TTTTGATGAG ACAAAGAGCT GGTACTTCAC TGAAAATGTG GAAAGGAACT 

AAAACTACTC TGTTTCTCGA CCATGAAGTG ACTTTTACAC CTTTCCTTGA 
3101 GCCGGGCCCC CTGCCATCTG CAGATGGAGG ACCCCACTCT GAAAGAAAAC 

CGGCCCGGGG GACGGTAGAC GTCTACCTCC TGGGGTGAGA CTTTCTTTTG 
3151 TATCGCTTCC ATGCAATCAA TGGCTATGTG ATGGATACAC TCCCTGGCTT 

ATAGCGAAGG TACGTTAGTT ACCGATACAC TACCTATGTG AGGGACCGAA 
3201 AGTAATGGCT CAGAATCAAA GGATCCGATG GTATCTGCTC AGCATGGGCA 

TCATTACCGA GTCTTAGTTT CCTAGGCTAC CATAGACGAG TCGTACCCGT 
3251 GCAATGAAAA TATCCATTCG ATTCATTTTA GCGGACACGT GTTCAGTGTA 

CGTTACTTTT ATAGGTAAGC TAAGTAAAAT CGCCTGTGCA CAAGTCACAT 
3301 CGGAAAAAGG AGGAGTATAA AATGGCCGTG TACAATCTCT ATCCGGGTGT 

GCCTTTTTCC TCCTCATATT TTACCGGCAC ATGTTAGAGA TAGGCCCACA 
3351 CTTTGAGACA GTGGAAATGC TACCGTCCAA AGTTGGAATT TGGCGAATAG 

GAAACTCTGT CACCTTTACG ATGGCAGGTT TCAACCTTAA ACCGCTTATC 
3401 AATGCCTGAT TGGCGAGCAC CTGCAAGCTG GGATGAGCAC GACTTTCCTG 

TTACGGACTA ACCGCTCGTG GACGTTCGAC CCTACTCGTG CTGAAAGGAC 
3451 GTGTACAGCA AGAAGTGTCA GACTCCCCTG GGAATGGCTT CTGGACACAT 

CACATGTCGT TCTTCACAGT CTGAGGGGAC CCTTACCGAA GACCTGTGTA 
3501 TAGAGATTTT CAGATTACAG CTTCAGGACA ATATGGACAG TGGGCCCCAA 

ATCTCTAAAA GTCTAATGTC GAAGTCCTGT TATACCTGTC ACCCGGGGTT 
3551 AGCTGGCCAG ACTTCATTAT TCCGGATCAA TCAATGCCTG GAGCACCAAG 

TCGACCGGTC TGAAGTAATA AGGCCTAGTT AGTTACGGAC CTCGTGGTTC 
3601 GAGCCCTTTT CTTGGATCAA GGTGGATCTG TTGGCACCAA TGATTATTCA 

CTCGGGAAAA GAACCTAGTT CCACCTAGAC AACCGTGGTT ACTAATAAGT 
3651 CGGCATCAAG ACCCAGGGTG CCCGTCAGAA GTTCTCCAGC CTCTACATCT 

GCCGTAGTTC TGGGTCCCAC GGGCAGTCTT CAAGAGGTCG GAGATGTAGA 
3701 CTCAGTTTAT CATCATGTAT AGTCTTGATG GGAAGAAGTG GCAGACTTAT 

GAGTCAAATA GTAGTACATA TCAGAACTAC CCTTCTTCAC CGTCTGAATA 
3751 CGAGGAAATT CCACTGGAAC CTTAATGGTC TTCTTTGGCA ATGTGGATTC 

GCTCCTTTAA GGTGACCTTG GAATTACCAG AAGAAACCGT TACACCTAAG 
3801 ATCTGGGATA AAACACAATA TTTTTAACCC TCCAATTATT GCTCGATACA 

TAGACCCTAT TTTGTGTTAT AAAAATTGGG AGGTTAATAA CGAGCTATGT 
3851 TCCGTTTGCA CCCAACTCAT TATAGCATTC GCAGCACTCT TCGCATGGAG 

AGGCAAACGT GGGTTGAGTA ATATCGTAAG CGTCGTGAGA AGCGTACCTC 
3901 TTGATGGGCT GTGATTTAAA TAGTTGCAGC ATGCCATTGG GAATGGAGAG 

AACTACCCGA CACTAAATTT ATCAACGTCG TACGGTAACC CTTACCTCTC 
3951 TAAAGCAATA TCAGATGCAC AGATTACTGC TTCATCCTAC TTTACCAATA 

ATTTCGTTAT AGTCTACGTG TCTAATGACG AAGTAGGATG AAATGGTTAT 
4001 TGTTTGCCAC CTGGTCTCCT TCAAAAGCTC GACTTCACCT CCAAGGGAGG 

ACAAACGGTG GACCAGAGGA AGTTTTCGAG CTGAAGTGGA GGTTCCCTCC 
4051 AGTAATGCCT GGAGACCTCA GGTGAATAAT CCAAAAGAGT GGCTGCAAGT 

TCATTACGGA CCTCTGGAGT CCACTTATTA GGTTTTCTCA CCGACGTTCA 
4101 GGACTTCCAG AAGACAATGA AAGTCACAGG AGTAACTACT CAGGGAGTAA 

CCTGAAGGTC TTCTGTTACT TTCAGTGTCC TCATTGATGA GTCCCTCATT 
4151 AATCTCTGCT TACCAGCATG TATGTGAAGG AGTTCCTCAT CTCCAGCAGT 

TTAGAGACGA ATGGTCGTAC ATACACTTCC TCAAGGAGTA GAGGTCGTCA 
4201 CAAGATGGCC ATCAGTGGAC TCTCTTTTTT CAGAATGGCA AAGTAAAGGT 

GTTCTACCGG TAGTCACCTG AGAGAAAAAA GTCTTACCGT TTCATTTCCA 
4251 TTTTCAGGGA AATCAAGACT CCTTCACACC TGTGGTGAAC TCTCTAGACC 

AAAAGTCCCT TTAGTTCTGA GGAAGTGTGG ACACCACTTG AGAGATCTGG 
4301 CACCGTTACT GACTCGCTAC CTTCGAATTC ACCCCCAGAG TTGGGTGCAC 

GTGGCAATGA CTGAGCGATG GAAGCTTAAG TGGGGGTCTC AACCCACGTG 
4351 CAGATTGCCC TGAGGATGGA GGTTCTGGGC TGCGAGGCAC AGGACCTCTA 

GTCTAACGGG ACTCCTACCT CCAAGACCCG ACGCTCCGTG TCCTGGAGAT 
4401 C 

G 
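
In the double-stranded listing above, each numbered row is followed by its complementary strand written base-for-base underneath (a plain complement, not a reverse complement). A short Python sketch of how such row pairs could be checked for transcription errors follows. The function names are hypothetical, and the example pair is the first row of the HP47/OL listing.

```python
# Watson-Crick pairing for unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(seq):
    """Return the base-for-base complement, ignoring spaces between blocks."""
    return seq.replace(" ", "").upper().translate(COMPLEMENT)


def mismatches(top, bottom):
    """Return 1-based positions where `bottom` is not the complement of `top`."""
    expected = complement(top)
    observed = bottom.replace(" ", "").upper()
    if len(expected) != len(observed):
        raise ValueError("strands differ in length")
    return [
        i + 1
        for i, (e, o) in enumerate(zip(expected, observed))
        if e != o
    ]


top = "ATGCAGCTAG AGCTCTCCAC CTGTGTCTTT CTGTGTCTCT TGCCACTCGG"
bottom = "TACGTCGATC TCGAGAGGTG GACACAGAAA GACACAGAGA ACGGTGAGCC"
print(mismatches(top, bottom))  # prints [] when the two rows agree
```

Positions are reported relative to the start of the row, so the row's leading number minus one has to be added to obtain the position in the full sequence.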

AMINO ACID SEQUENCE OF HUMAN B DOMAIN-DELETED FACTOR VIII (HSQ) 



Met Gln Ile Glu Leu Ser Thr Cys Phe Phe Leu Cys Leu Leu Arg Phe 
1 5 10 15 

Cys Phe Ser Ala Thr Arg Arg Tyr Tyr Leu Gly Ala Val Glu Leu Ser 
20 25 30 

Trp Asp Tyr Met Gln Ser Asp Leu Gly Glu Leu Pro Val Asp Ala Arg 
35 40 45 

Phe Pro Pro Arg Val Pro Lys Ser Phe Pro Phe Asn Thr Ser Val Val 
50 55 60 

Tyr Lys Lys Thr Leu Phe Val Glu Phe Thr Val His Leu Phe Asn Ile 
65 70 75 80 

Ala Lys Pro Arg Pro Pro Trp Met Gly Leu Leu Gly Pro Thr Ile Gln 
85 90 95 

Ala Glu Val Tyr Asp Thr Val Val Ile Thr Leu Lys Asn Met Ala Ser 
100 105 110 

His Pro Val Ser Leu His Ala Val Gly Val Ser Tyr Trp Lys Ala Ser 
115 120 125 

Glu Gly Ala Glu Tyr Asp Asp Gln Thr Ser Gln Arg Glu Lys Glu Asp 
130 135 140 

Asp Lys Val Phe Pro Gly Gly Ser His Thr Tyr Val Trp Gln Val Leu 
145 150 155 160 

Lys Glu Asn Gly Pro Met Ala Ser Asp Pro Leu Cys Leu Thr Tyr Ser 
165 170 175 

Tyr Leu Ser His Val Asp Leu Val Lys Asp Leu Asn Ser Gly Leu Ile 
180 185 190 

Gly Ala Leu Leu Val Cys Arg Glu Gly Ser Leu Ala Lys Glu Lys Thr 
195 200 205 

Gln Thr Leu His Lys Phe Ile Leu Leu Phe Ala Val Phe Asp Glu Gly 
210 215 220 

Lys Ser Trp His Ser Glu Thr Lys Asn Ser Leu Met Gln Asp Arg Asp 
225 230 235 240 

Ala Ala Ser Ala Arg Ala Trp Pro Lys Met His Thr Val Asn Gly Tyr 
245 250 255 

Val Asn Arg Ser Leu Pro Gly Leu Ile Gly Cys His Arg Lys Ser Val 
260 265 270 

Tyr Trp His Val Ile Gly Met Gly Thr Thr Pro Glu Val His Ser Ile 
275 280 285 

Phe Leu Glu Gly His Thr Phe Leu Val Arg Asn His Arg Gln Ala Ser 
290 295 300 

Leu Glu Ile Ser Pro Ile Thr Phe Leu Thr Ala Gln Thr Leu Leu Met 
305 310 315 320 

Asp Leu Gly Gln Phe Leu Leu Phe Cys His Ile Ser Ser His Gln His 
325 330 335 

Asp Gly Met Glu Ala Tyr Val Lys Val Asp Ser Cys Pro Glu Glu Pro 
340 345 350 

Gln Leu Arg Met Lys Asn Asn Glu Glu Ala Glu Asp Tyr Asp Asp Asp 
355 360 365 

Leu Thr Asp Ser Glu Met Asp Val Val Arg Phe Asp Asp Asp Asn Ser 
370 375 380 

Pro Ser Phe Ile Gln Ile Arg Ser Val Ala Lys Lys His Pro Lys Thr 
385 390 395 400 

Trp Val His Tyr Ile Ala Ala Glu Glu Glu Asp Trp Asp Tyr Ala Pro 
405 410 415 

FIG. 10A 

Title: Nucleic Acid and Amino Acid Sequences Encoding High-Level 
Expressor Factor VIII Polypeptides and Methods of Use 
Inventor(s): Lollar 
Application No: To be Assigned 
Atty Dkt No: 007157/276516 

Leu Val Leu Ala Pro Asp Asp Arg Ser Tyr Lys Ser Gln Tyr Leu Asn 
420 425 430 

Asn Gly Pro Gln Arg Ile Gly Arg Lys Tyr Lys Lys Val Arg Phe Met 
435 440 445 

Ala Tyr Thr Asp Glu Thr Phe Lys Thr Arg Glu Ala Ile Gln His Glu 
450 455 460 

Ser Gly Ile Leu Gly Pro Leu Leu Tyr Gly Glu Val Gly Asp Thr Leu 
465 470 475 480 

Leu Ile Ile Phe Lys Asn Gln Ala Ser Arg Pro Tyr Asn Ile Tyr Pro 
485 490 495 

His Gly Ile Thr Asp Val Arg Pro Leu Tyr Ser Arg Arg Leu Pro Lys 
500 505 510 

Gly Val Lys His Leu Lys Asp Phe Pro Ile Leu Pro Gly Glu Ile Phe 
515 520 525 

Lys Tyr Lys Trp Thr Val Thr Val Glu Asp Gly Pro Thr Lys Ser Asp 
530 535 540 

Pro Arg Cys Leu Thr Arg Tyr Tyr Ser Ser Phe Val Asn Met Glu Arg 
545 550 555 560 

Asp Leu Ala Ser Gly Leu Ile Gly Pro Leu Leu Ile Cys Tyr Lys Glu 
565 570 575 

Ser Val Asp Gln Arg Gly Asn Gln Ile Met Ser Asp Lys Arg Asn Val 
580 585 590 

Ile Leu Phe Ser Val Phe Asp Glu Asn Arg Ser Trp Tyr Leu Thr Glu 
595 600 605 

Asn Ile Gln Arg Phe Leu Pro Asn Pro Ala Gly Val Gln Leu Glu Asp 
610 615 620 

Pro Glu Phe Gln Ala Ser Asn Ile Met His Ser Ile Asn Gly Tyr Val 
625 630 635 640 

Phe Asp Ser Leu Gln Leu Ser Val Cys Leu His Glu Val Ala Tyr Trp 
645 650 655 

Tyr Ile Leu Ser Ile Gly Ala Gln Thr Asp Phe Leu Ser Val Phe Phe 
660 665 670 

Ser Gly Tyr Thr Phe Lys His Lys Met Val Tyr Glu Asp Thr Leu Thr 
675 680 685 

Leu Phe Pro Phe Ser Gly Glu Thr Val Phe Met Ser Met Glu Asn Pro 
690 695 700 

Gly Leu Trp Ile Leu Gly Cys His Asn Ser Asp Phe Arg Asn Arg Gly 
705 710 715 720 

Met Thr Ala Leu Leu Lys Val Ser Ser Cys Asp Lys Asn Thr Gly Asp 
725 730 735 

Tyr Tyr Glu Asp Ser Tyr Glu Asp Ile Ser Ala Tyr Leu Leu Ser Lys 
740 745 750 

Asn Asn Ala Ile Glu Pro Arg Ser Phe Ser Gln Asn Pro Pro Val Leu 
755 760 765 

Lys Arg His Gln Arg Glu Ile Thr Arg Thr Thr Leu Gln Ser Asp Gln 
770 775 780 

Glu Glu Ile Asp Tyr Asp Asp Thr Ile Ser Val Glu Met Lys Lys Glu 
785 790 795 800 

Asp Phe Asp Ile Tyr Asp Glu Asp Glu Asn Gln Ser Pro Arg Ser Phe 
805 810 815 

Gln Lys Lys Thr Arg His Tyr Phe Ile Ala Ala Val Glu Arg Leu Trp 
820 825 830 

Asp Tyr Gly Met Ser Ser Ser Pro His Val Leu Arg Asn Arg Ala Gln 
835 840 845 

Ser Gly Ser Val Pro Gln Phe Lys Lys Val Val Phe Gln Glu Phe Thr 
850 855 860 

FIG. 10B 

Asp Gly Ser Phe Thr Gln Pro Leu Tyr Arg Gly Glu Leu Asn Glu His 
865 870 875 880 

Leu Gly Leu Leu Gly Pro Tyr Ile Arg Ala Glu Val Glu Asp Asn Ile 
885 890 895 

Met Val Thr Phe Arg Asn Gln Ala Ser Arg Pro Tyr Ser Phe Tyr Ser 
900 905 910 

Ser Leu Ile Ser Tyr Glu Glu Asp Gln Arg Gln Gly Ala Glu Pro Arg 
915 920 925 

Lys Asn Phe Val Lys Pro Asn Glu Thr Lys Thr Tyr Phe Trp Lys Val 
930 935 940 

Gln His His Met Ala Pro Thr Lys Asp Glu Phe Asp Cys Lys Ala Trp 
945 950 955 960 

Ala Tyr Phe Ser Asp Val Asp Leu Glu Lys Asp Val His Ser Gly Leu 
965 970 975 

Ile Gly Pro Leu Leu Val Cys His Thr Asn Thr Leu Asn Pro Ala His 
980 985 990 

Gly Arg Gln Val Thr Val Gln Glu Phe Ala Leu Phe Phe Thr Ile Phe 
995 1000 1005 

Asp Glu Thr Lys Ser Trp Tyr Phe Thr Glu Asn Met Glu Arg Asn Cys 
1010 1015 1020 

Arg Ala Pro Cys Asn Ile Gln Met Glu Asp Pro Thr Phe Lys Glu Asn 
1025 1030 1035 1040 

Tyr Arg Phe His Ala Ile Asn Gly Tyr Ile Met Asp Thr Leu Pro Gly 
1045 1050 1055 

Leu Val Met Ala Gln Asp Gln Arg Ile Arg Trp Tyr Leu Leu Ser Met 
1060 1065 1070 

Gly Ser Asn Glu Asn Ile His Ser Ile His Phe Ser Gly His Val Phe 
1075 1080 1085 

Thr Val Arg Lys Lys Glu Glu Tyr Lys Met Ala Leu Tyr Asn Leu Tyr 
1090 1095 1100 

Pro Gly Val Phe Glu Thr Val Glu Met Leu Pro Ser Lys Ala Gly Ile 
1105 1110 1115 1120 

Trp Arg Val Glu Cys Leu Ile Gly Glu His Leu His Ala Gly Met Ser 
1125 1130 1135 

Thr Leu Phe Leu Val Tyr Ser Asn Lys Cys Gln Thr Pro Leu Gly Met 
1140 1145 1150 

Ala Ser Gly His Ile Arg Asp Phe Gln Ile Thr Ala Ser Gly Gln Tyr 
1155 1160 1165 

Gly Gln Trp Ala Pro Lys Leu Ala Arg Leu His Tyr Ser Gly Ser Ile 
1170 1175 1180 

Asn Ala Trp Ser Thr Lys Glu Pro Phe Ser Trp Ile Lys Val Asp Leu 
1185 1190 1195 1200 

Leu Ala Pro Met Ile Ile His Gly Ile Lys Thr Gln Gly Ala Arg Gln 
1205 1210 1215 

Lys Phe Ser Ser Leu Tyr Ile Ser Gln Phe Ile Ile Met Tyr Ser Leu 
1220 1225 1230 

Asp Gly Lys Lys Trp Gln Thr Tyr Arg Gly Asn Ser Thr Gly Thr Leu 
1235 1240 1245 

Met Val Phe Phe Gly Asn Val Asp Ser Ser Gly Ile Lys His Asn Ile 
1250 1255 1260 

Phe Asn Pro Pro Ile Ile Ala Arg Tyr Ile Arg Leu His Pro Thr His 
1265 1270 1275 1280 

Tyr Ser Ile Arg Ser Thr Leu Arg Met Glu Leu Met Gly Cys Asp Leu 
1285 1290 1295 

Asn Ser Cys Ser Met Pro Leu Gly Met Glu Ser Lys Ala Ile Ser Asp 
1300 1305 1310 

Ala Gln Ile Thr Ala Ser Ser Tyr Phe Thr Asn Met Phe Ala Thr Trp 
1315 1320 1325 

Ser Pro Ser Lys Ala Arg Leu His Leu Gln Gly Arg Ser Asn Ala Trp 
1330 1335 1340 

Arg Pro Gln Val Asn Asn Pro Lys Glu Trp Leu Gln Val Asp Phe Gln 
1345 1350 1355 1360 

Lys Thr Met Lys Val Thr Gly Val Thr Thr Gln Gly Val Lys Ser Leu 
1365 1370 1375 

Leu Thr Ser Met Tyr Val Lys Glu Phe Leu Ile Ser Ser Ser Gln Asp 
1380 1385 1390 

Gly His Gln Trp Thr Leu Phe Phe Gln Asn Gly Lys Val Lys Val Phe 
1395 1400 1405 

Gln Gly Asn Gln Asp Ser Phe Thr Pro Val Val Asn Ser Leu Asp Pro 
1410 1415 1420 

Pro Leu Leu Thr Arg Tyr Leu Arg Ile His Pro Gln Ser Trp Val His 
1425 1430 1435 1440 

Gln Ile Ala Leu Arg Met Glu Val Leu Gly Cys Glu Ala Gln Asp Leu 
1445 1450 1455 

Tyr 

NUCLEOTIDE SEQUENCE OF HUMAN B DOMAIN-DELETED FACTOR VIII (HSQ) 

1 ATGCAAATAG AGCTCTCCAC CTGCTTCTTT CTGTGCCTTT TGCGATTCTG 

51 CTTTAGTGCC ACCAGAAGAT ACTACCTGGG TGCAGTGGAA CTGTCATGGG 

101 ACTATATGCA AAGTGATCTC GGTGAGCTGC CTGTGGACGC AAGATTTCCT 

151 CCTAGAGTGC CAAAATCTTT TCCATTCAAC ACCTCAGTCG TGTACAAAAA 

201 GACTCTGTTT GTAGAATTCA CGGTTCACCT TTTCAACATC GCTAAGCCAA 

251 GGCCACCCTG GATGGGTCTG CTAGGTCCTA CCATCCAGGC TGAGGTTTAT 

301 GATACAGTGG TCATTACACT TAAGAACATG GCTTCCCATC CTGTCAGTCT 

351 TCATGCTGTT GGTGTATCCT ACTGGAAAGC TTCTGAGGGA GCTGAATATG 

401 ATGATCAGAC CAGTCAAAGG GAGAAAGAAG ATGATAAAGT CTTCCCTGGT 

451 GGAAGCCATA CATATGTCTG GCAGGTCCTG AAAGAGAATG GTCCAATGGC 

501 CTCTGACCCA CTGTGCCTTA CCTACTCATA TCTTTCTCAT GTGGACCTGG 

551 TAAAAGACTT GAATTCAGGC CTCATTGGAG CCCTACTAGT ATGTAGAGAA 

601 GGGAGTCTGG CCAAGGAAAA GACACAGACC TTGCACAAAT TTATACTACT 

651 TTTTGCTGTA TTTGATGAAG GGAAAAGTTG GCACTCAGAA ACAAAGAACT 

701 CCTTGATGCA GGATAGGGAT GCTGCATCTG CTCGGGCCTG GCCTAAAATG 

751 CACACAGTCA ATGGTTATGT AAACAGGTCT CTGCCAGGTC TGATTGGATG 

801 CCACAGGAAA TCAGTCTATT GGCATGTGAT TGGAATGGGC ACCACTCCTG 

851 AAGTGCACTC AATATTCCTC GAAGGTCACA CATTTCTTGT GAGGAACCAT 

901 CGCCAGGCGT CCTTGGAAAT CTCGCCAATA ACTTTCCTTA CTGCTCAAAC 

951 ACTCTTGATG GACCTTGGAC AGTTTCTACT GTTTTGTCAT ATCTCTTCCC 

1001 ACCAACATGA TGGCATGGAA GCTTATGTCA AAGTAGACAG CTGTCCAGAG 

1051 GAACCCCAAC TACGAATGAA AAATAATGAA GAAGCGGAAG ACTATGATGA 

1101 TGATCTTACT GATTCTGAAA TGGATGTGGT CAGGTTTGAT GATGACAACT 

1151 CTCCTTCCTT TATCCAAATT CGCTCAGTTG CCAAGAAGCA TCCTAAAACT 

1201 TGGGTACATT ACATTGCTGC TGAAGAGGAG GACTGGGACT ATGCTCCCTT 

1251 AGTCCTCGCC CCCGATGACA GAAGTTATAA AAGTCAATAT TTGAACAATG 

1301 GCCCTCAGCG GATTGGTAGG AAGTACAAAA AAGTCCGATT TATGGCATAC 

1351 ACAGATGAAA CCTTTAAGAC GCGTGAAGCT ATTCAGCATG AATCAGGAAT 

1401 CTTGGGACCT TTACTTTATG GGGAAGTTGG AGACACACTG TTGATTATAT 

1451 TTAAGAATCA AGCAAGCAGA CCATATAACA TCTACCCTCA CGGAATCACT 

1501 GATGTCCGTC CTTTGTATTC AAGGAGATTA CCAAAAGGTG TAAAACATTT 

1551 GAAGGATTTT CCAATTCTGC CAGGAGAAAT ATTCAAATAT AAATGGACAG 

1601 TGACTGTAGA AGATGGGCCA ACTAAATCAG ATCCGCGGTG CCTGACCCGC 

1651 TATTACTCTA GTTTCGTTAA TATGGAGAGA GATCTAGCTT CAGGACTCAT 

1701 TGGCCCTCTC CTCATCTGCT ACAAAGAATC TGTAGATCAA AGAGGAAACC 

1751 AGATAATGTC AGACAAGAGG AATGTCATCC TGTTTTCTGT ATTTGATGAG 

1801 AACCGAAGCT GGTACCTCAC AGAGAATATA CAACGCTTTC TCCCCAATCC 

1851 AGCTGGAGTG CAGCTTGAGG ATCCAGAGTT CCAAGCCTCC AACATCATGC 

1901 ACAGCATCAA TGGCTATGTT TTTGATAGTT TGCAGTTGTC AGTTTGTTTG 

1951 CATGAGGTGG CATACTGGTA CATTCTAAGC ATTGGAGCAC AGACTGACTT 

2001 CCTTTCTGTC TTCTTCTCTG GATATACCTT CAAACACAAA ATGGTCTATG 

2051 AAGACACACT CACCCTATTC CCATTCTCAG GAGAAACTGT CTTCATGTCG 

2101 ATGGAAAACC CAGGTCTATG GATTCTGGGG TGCCACAACT CAGACTTTCG 

2151 GAACAGAGGC ATGACCGCCT TACTGAAGGT TTCTAGTTGT GACAAGAACA 

2201 CTGGTGATTA TTACGAGGAC AGTTATGAAG ATATTTCAGC ATACTTGCTG 

2251 AGTAAAAACA ATGCCATTGA ACCTAGGAGC TTCTCTCAGA ATCCACCAGT 

2301 CTTGAAACGC CATCAACGGG AAATAACTCG TACTACTCTT CAGTCAGATC 

2351 AAGAGGAAAT TGACTATGAT GATACCATAT CAGTTGAAAT GAAGAAGGAA 

2401 GATTTTGACA TTTATGATGA GGATGAAAAT CAGAGCCCCC GCAGCTTTCA 

2451 AAAGAAAACA CGACACTATT TTATTGCTGC AGTGGAGAGG CTCTGGGATT 

2501 ATGGGATGAG TAGCTCCCCA CATGTTCTAA GAAACAGGGC TCAGAGTGGC 

2551 AGTGTCCCTC AGTTCAAGAA AGTTGTTTTC CAGGAATTTA CTGATGGCTC 

2601 CTTTACTCAG CCCTTATACC GTGGAGAACT AAATGAACAT TTGGGACTCC 

2651 TGGGGCCATA TATAAGAGCA GAAGTTGAAG ATAATATCAT GGTAACTTTC 

2701 AGAAATCAGG CCTCTCGTCC CTATTCCTTC TATTCTAGCC TTATTTCTTA 

2751 TGAGGAAGAT CAGAGGCAAG GAGCAGAACC TAGAAAAAAC TTTGTCAAGC 

2801 CTAATGAAAC CAAAACTTAC TTTTGGAAAG TGCAACATCA TATGGCACCC 

2851 ACTAAAGATG AGTTTGACTG CAAAGCCTGG GCTTATTTCT CTGATGTTGA 

2901 CCTGGAAAAA GATGTGCACT CAGGCCTGAT TGGACCCCTT CTGGTCTGCC 

2951 ACACTAACAC ACTGAACCCT GCTCATGGGA GACAAGTGAC AGTACAGGAA 

3001 TTTGCTCTGT TTTTCACCAT CTTTGATGAG ACCAAAAGCT GGTACTTCAC 

3051 TGAAAATATG GAAAGAAACT GCAGGGCTCC CTGCAATATC CAGATGGAAG 

3101 ATCCCACTTT TAAAGAGAAT TATCGCTTCC ATGCAATCAA TGGCTACATA 

3151 ATGGATACAC TACCTGGCTT AGTAATGGCT CAGGATCAAA GGATTCGATG 

3201 GTATCTGCTC AGCATGGGCA GCAATGAAAA CATCCATTCT ATTCATTTCA 

3251 GTGGACATGT GTTCACTGTA CGAAAAAAAG AGGAGTATAA AATGGCACTG 

3301 TACAATCTCT ATCCAGGTGT TTTTGAGACA GTGGAAATGT TACCATCCAA 

3351 AGCTGGAATT TGGCGGGTGG AATGCCTTAT TGGCGAGCAT CTACATGCTG 

3401 GGATGAGCAC ACTTTTTCTG GTGTACAGCA ATAAGTGTCA GACTCCCCTG 

3451 GGAATGGCTT CTGGACACAT TAGAGATTTT CAGATTACAG CTTCAGGACA 

3501 ATATGGACAG TGGGCCCCAA AGCTGGCCAG ACTTCATTAT TCCGGATCAA 

3551 TCAATGCCTG GAGCACCAAG GAGCCCTTTT CTTGGATCAA GGTGGATCTG 

3601 TTGGCACCAA TGATTATTCA CGGCATCAAG ACCCAGGGTG CCCGTCAGAA 

3651 GTTCTCCAGC CTCTACATCT CTCAGTTTAT CATCATGTAT AGTCTTGATG 

3701 GGAAGAAGTG GCAGACTTAT CGAGGAAATT CCACTGGAAC CTTAATGGTC 

3751 TTCTTTGGCA ATGTGGATTC ATCTGGGATA AAACACAATA TTTTTAACCC 

3801 TCCAATTATT GCTCGATACA TCCGTTTGCA CCCAACTCAT TATAGCATTC 

3851 GCAGCACTCT TCGCATGGAG TTGATGGGCT GTGATTTAAA TAGTTGCAGC 

3901 ATGCCATTGG GAATGGAGAG TAAAGCAATA TCAGATGCAC AGATTACTGC 

3951 TTCATCCTAC TTTACCAATA TGTTTGCCAC CTGGTCTCCT TCAAAAGCTC 

4001 GACTTCACCT CCAAGGGAGG AGTAATGCCT GGAGACCTCA GGTGAATAAT 

4051 CCAAAAGAGT GGCTGCAAGT GGACTTCCAG AAGACAATGA AAGTCACAGG 

4101 AGTAACTACT CAGGGAGTAA AATCTCTGCT TACCAGCATG TATGTGAAGG 

4151 AGTTCCTCAT CTCCAGCAGT CAAGATGGCC ATCAGTGGAC TCTCTTTTTT 

4201 CAGAATGGCA AAGTAAAGGT TTTTCAGGGA AATCAAGACT CCTTCACACC 

4251 TGTGGTGAAC TCTCTAGACC CACCGTTACT GACTCGCTAC CTTCGAATTC 

4301 ACCCCCAGAG TTGGGTGCAC CAGATTGCCC TGAGGATGGA GGTTCTGGGC 

4351 TGCGAGGCAC AGGACCTCTA C 
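
The nucleotide listing above can be related to the three-letter amino acid listing by translating it codon by codon. The following minimal Python sketch assumes the standard genetic code and a reading frame that starts at nucleotide 1. The names `CODON_TABLE` and `translate` are illustrative, and the example input is the first 30 bases of the HSQ listing.

```python
BASES = "TCAG"
# Standard genetic code, one letter per codon, with codons ordered
# TTT, TTC, TTA, TTG, TCT, ... GGG (each position cycles through T, C, A, G).
AMINO_1 = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# One-letter to three-letter residue codes; "Ter" marks a stop codon.
THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}

CODON_TABLE = {
    a + b + c: AMINO_1[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(nt):
    """Translate a DNA string (spaces allowed) into three-letter codes."""
    seq = "".join(nt.split()).upper()
    usable = len(seq) - len(seq) % 3  # drop any trailing partial codon
    return [THREE[CODON_TABLE[seq[i:i + 3]]] for i in range(0, usable, 3)]


print(" ".join(translate("ATGCAAATAG AGCTCTCCAC CTGCTTCTTT")))
# prints: Met Gln Ile Glu Leu Ser Thr Cys Phe Phe
```

Comparing such a translation with the printed amino acid rows is one way to spot characters that were misread in either listing.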

HSQ(HHHHH) 





HP47/OL(PPPHH) 
HP46/SQ(PHHHH) 
HP1/SQ(HPHHH) 
HP63/OL 

AMINO ACID SEQUENCE OF HP63/OL 

1 MQLELSTCVF LCLLPLGFSA IRRYYLGAVE LSWDYRQSEL LRELHVDTRF 

51 PATAPGALPL GPSVLYKKTV FVEFTDQLFS VARPRPPWMG LLGPTIQAEV 

101 YDTVVVTLKN MASHPVSLHA VGVSFWKSSE GAEYEDHTSQ REKEDDKVLP 

151 GKSQTYVWQV LKENGPTASD PPCLTYSYLS HVDLVKDLNS GLIGALLVCR 

201 EGSLTRERTQ NLHEFVLLFA VFDEGKSWHS ARNDSWTRAM DPAPARAQPA 

251 MHTVNGYVNR SLPGLIGCHK KSVYWHVIGM GTSPEVHSIF LEGHTFLVRH 

301 HRQASLEISP LTFLTAQTFL MDLGQFLLFC HISSHHHGGM EAHVRVESCA 

351 EEPQLRRKAD EEEDYDDNLY DSDMDVVRLD GDDVSPFIQI RSVAKKHPKT 

401 WVHYIAAEEE DWDYAPLVLA PDDRSYKSQY LNNGPQRIGR KYKKVRFMAY 

451 TDETFKTREA IQHESGILGP LLYGEVGDTL LIIFKNQASR PYNIYPHGIT 

501 DVRPLYSRRL PKGVKHLKDF PILPGEIFKY KWTVTVEDGP TKSDPRCLTR 

551 YYSSFVNMER DLASGLIGPL LICYKESVDQ RGNQIMSDKR NVILFSVFDE 

601 NRSWYLTENI QRFLPNPAGV QLEDPEFQAS NIMHSINGYV FDSLQLSVCL 

651 HEVAYWYILS IGAQTDFLSV FFSGYTFKHK MVYEDTLTLF PFSGETVFMS 

701 MENPGLWILG CHNSDFRNRG MTALLKVSSC DKNTGDYYED SYEDISAYLL 

751 SKNNAIEPRS FSQNSRHPST RSQNPPVLKR HQREITRTTL QSDQEEIDYD 

801 DTISVEMKKE DFDIYDEDEN QSPRSFQKRT RHYFIAAVEQ LWDYGMSESP 

851 RALRNRAQNG EVPRFKKVVF REFADGSFTQ PSYRGELNKH LGLLGPYIRA 

901 EVEDNIMVTF KNQASRPYSF YSSLISYPDD QEQGAEPRKN FVKPNETKTY 

951 FWKVQHHMAP TEDEFDCKAW AYFSDVDLEK DVHSGLIGPL LICRANTLNA 

1001 AHGRQVTVQE FALFFTIFDE TKSWYFTENV ERNCRAPCHL QMEDPTLKEN 

1051 YRFHAINGYV MDTLPGLVMA QNQRIRWYLL SMGSNENIHS IHFSGHVFSV 

1101 RKKEEYKMAV YNLYPGVFET VEMLPSKVGI WRNRCLIGEH LQAGMSTTFL 

1151 VYSKKCQTPL GMASGHIRDF QITASGQYGQ WAPKLARLHY SGSINAWSTK 

1201 EPFSWIKVDL LAPMIIHGIK TQGARQKFSS LYISQFIIMY SLDGKKWQTY 

1251 RGNSTGTLMV FFGNVDSSGI KHNIFNPPII ARYIRLHPTH YSIRSTLRME 

1301 LMGCDLNSCS MPLGMESKAI SDAQITASSY FTNMFATWSP SKARLHLQGR 

1351 SNAWRPQVNN PKEWLQVDFQ KTMKVTGVTT QGVKSLLTSM YVKEFLISSS 

1401 QDGHQWTLFF QNGKVKVFQG NQDSFTPVVN SLDPPLLTRY LRIHPQSWVH 

1451 QIALRMEVLG CEAQDLY 

FIG. 13 
NUCLEOTIDE SEQUENCE OF HP63/OL 

1 ATGCAGCTAG AGCTCTCCAC CTGTGTCTTT CTGTGTCTCT TGCCACTCGG 

51 CTTTAGTGCC ATCAGGAGAT ACTACCTGGG CGCAGTGGAA CTGTCCTGGG 

101 ACTACCGGCA AAGTGAACTC CTCCGTGAGC TGCACGTGGA CACCAGATTT 

151 CCTGCTACAG CGCCAGGAGC TCTTCCGTTG GGCCCGTCAG TCCTGTACAA 

201 AAAGACTGTG TTCGTAGAGT TCACGGATCA ACTTTTCAGC GTTGCCAGGC 

251 CCAGGCCACC ATGGATGGGT CTGCTGGGTC CTACCATCCA GGCTGAGGTT 

301 TACGACACGG TGGTCGTTAC CCTGAAGAAC ATGGCTTCTC ATCCCGTTAG 

351 TCTTCACGCT GTCGGCGTCT CCTTCTGGAA ATCTTCCGAA GGCGCTGAAT 

401 ATGAGGATCA CACCAGCCAA AGGGAGAAGG AAGACGATAA AGTCCTTCCC 

451 GGTAAAAGCC AAACCTACGT CTGGCAGGTC CTGAAAGAAA ATGGTCCAAC 

501 AGCCTCTGAC CCACCATGTC TTACCTACTC ATACCTGTCT CACGTGGACC 

551 TGGTGAAAGA CCTGAATTCG GGCCTCATTG GAGCCCTGCT GGTTTGTAGA 

601 GAAGGGAGTC TGACCAGAGA AAGGACCCAG AACCTGCACG AATTTGTACT 

651 ACTTTTTGCT GTCTTTGATG AAGGGAAAAG TTGGCACTCA GCAAGAAATG 

701 ACTCCTGGAC ACGGGCCATG GATCCCGCAC CTGCCAGGGC CCAGCCTGCA 

751 ATGCACACAG TCAATGGCTA TGTCAACAGG TCTCTGCCAG GTCTGATCGG 

801 ATGTCATAAG AAATCAGTCT ACTGGCACGT GATTGGAATG GGCACCAGCC 

851 CGGAAGTGCA CTCCATTTTT CTTGAAGGCC ACACGTTTCT CGTGAGGCAC 

901 CATCGCCAGG CTTCCTTGGA GATCTCGCCA CTAACTTTCC TCACTGCTCA 

951 GACATTCCTG ATGGACCTTG GCCAGTTCCT ACTGTTTTGT CATATCTCTT 

1001 CCCACCACCA TGGTGGCATG GAGGCTCACG TCAGAGTAGA AAGCTGCGCC 

1051 GAGGAGCCCC AGCTGCGGAG GAAAGCTGAT GAAGAGGAAG ATTATGATGA 

1101 CAATTTGTAC GACTCGGACA TGGACGTGGT CCGGCTCGAT GGTGACGACG 

1151 TGTCTCCCTT TATCCAAATC CGCTCAGTTG CCAAGAAGCA TCCTAAAACT 

1201 TGGGTACATT ACATTGCTGC TGAAGAGGAG GACTGGGACT ATGCTCCCTT 

1251 AGTCCTCGCC CCCGATGACA GAAGTTATAA AAGTCAATAT TTGAACAATG 

1301 GCCCTCAGCG GATTGGTAGG AAGTACAAAA AAGTCCGATT TATGGCATAC 
1351 ACAGATGAAA CCTTTAAGAC TCGTGAAGCT ATTCAGCATG AATCAGGAAT 

1401 CTTGGGACCT TTACTTTATG GGGAAGTTGG AGACACACTG TTGATTATAT 
1451 TTAAGAATCA AGCAAGCAGA CCATATAACA TCTACCCTCA CGGAATCACT 
1501 GATGTCCGTC CTTTGTATTC AAGGAGATTA CCAAAAGGTG TAAAACATTT 
1551 GAAGGATTTT CCAATTCTGC CAGGAGAAAT ATTCAAATAT AAATGGACAG 
1601 TGACTGTAGA AGATGGGCCA ACTAAATCAG ATCCTCGGTG CCTGACCCGC 
1651 TATTACTCTA GTTTCGTTAA TATGGAGAGA GATCTAGCTT CAGGACTCAT 
1701 TGGCCCTCTC CTCATCTGCT ACAAAGAATC TGTAGATCAA AGAGGAAACC 
1751 AGATAATGTC AGACAAGAGG AATGTCATCC TGTTTTCTGT ATTTGATGAG 
1801 AACCGAAGCT GGTACCTCAC AGAGAATATA CAACGCTTTC TCCCCAATCC 
1851 AGCTGGAGTG CAGCTTGAGG ATCCAGAGTT CCAAGCCTCC AACATCATGC 
1901 ACAGCATCAA TGGCTATGTT TTTGATAGTT TGCAGTTGTC AGTTTGTTTG 
1951 CATGAGGTGG CATACTGGTA CATTCTAAGC ATTGGAGCAC AGACTGACTT 
2001 CCTTTCTGTC TTCTTCTCTG GATATACCTT CAAACACAAA ATGGTCTATG 
2051 AAGACACACT CACCCTATTC CCATTCTCAG GAGAAACTGT CTTCATGTCG 
2101 ATGGAAAACC CAGGTCTATG GATTCTGGGG TGCCACAACT CAGACTTTCG 
2151 GAACAGAGGC ATGACCGCCT TACTGAAGGT TTCTAGTTGT GACAAGAACA 
2201 CTGGTGATTA TTACGAGGAC AGTTATGAAG ATATTTCAGC ATACTTGCTG 
2251 AGTAAAAACA ATGCCATTGA ACCTAGGAGC TTCTCCCAGA ATTCAAGACA 
2301 CCCTAGCACT AGGTCTCAAA ACCCACCAGT CTTGAAACGC CATCAACGGG 
2351 AAATAACTCG TACTACTCTT CAGTCAGATC AAGAGGAAAT TGACTATGAT 
2401 GATACCATAT CAGTTGAAAT GAAGAAGGAA GATTTTGACA TTTATGATGA 

2451 GGATGAAAAT CAGAGCCCCC GCAGCTTTCA AAAGAGAACC CGACACTATT 

2501 TCATTGCTGC GGTGGAGCAG CTCTGGGATT ACGGGATGAG CGAATCCCCC 

2551 CGGGCGCTAA GAAACAGGGC TCAGAACGGA GAGGTGCCTC GGTTCAAGAA 

2601 GGTGGTCTTC CGGGAATTTG CTGACGGCTC CTTCACGCAG CCGTCGTACC 

2651 GCGGGGAACT CAACAAACAC TTGGGGCTCT TGGGACCCTA CATCAGAGCG 

2701 GAAGTTGAAG ACAACATCAT GGTAACTTTC AAAAACCAGG CGTCTCGTCC 

2751 CTATTCCTTC TACTCGAGCC TTATTTCTTA TCCGGATGAT CAGGAGCAAG 

2801 GGGCAGAACC TCGAAAAAAC TTTGTCAAGC CTAATGAAAC CAAAACTTAC 

2851 TTTTGGAAAG TGCAGCATCA CATGGCACCC ACAGAAGACG AGTTTGACTG 

2901 CAAAGCCTGG GCCTACTTTT CTGATGTTGA CCTGGAAAAA GATGTGCACT 

2951 CAGGCTTGAT CGGCCCCCTT CTGATCTGCC GCGCCAACAC CCTGAACGCT 

3001 GCTCACGGTA GACAAGTGAC CGTGCAAGAA TTTGCTCTGT TTTTCACTAT 

3051 TTTTGATGAG ACAAAGAGCT GGTACTTCAC TGAAAATGTG GAAAGGAACT 

3101 GCCGGGCCCC CTGCCATCTG CAGATGGAGG ACCCCACTCT GAAAGAAAAC 

3151 TATCGCTTCC ATGCAATCAA TGGCTATGTG ATGGATACAC TCCCTGGCTT 

3201 AGTAATGGCT CAGAATCAAA GGATCCGATG GTATCTGCTC AGCATGGGCA 

3251 GCAATGAAAA TATCCATTCG ATTCATTTTA GCGGACACGT GTTCAGTGTA 

3301 CGGAAAAAGG AGGAGTATAA AATGGCCGTG TACAATCTCT ATCCGGGTGT 

3351 CTTTGAGACA GTGGAAATGC TACCGTCCAA AGTTGGAATT TGGCGGAATA 

3401 GATGCCTGAT TGGCGAGCAC CTGCAAGCTG GGATGAGCAC GACTTTCCTG 

3451 GTGTACAGCA AGAAGTGTCA GACTCCCCTG GGAATGGCTT CTGGACACAT 

3501 TAGAGATTTT CAGATTACAG CTTCAGGACA ATATGGACAG TGGGCCCCAA 

3551 AGCTGGCCAG ACTTCATTAT TCCGGATCAA TCAATGCCTG GAGCACCAAG 

3601 GAGCCCTTTT CTTGGATCAA GGTGGATCTG TTGGCACCAA TGATTATTCA 

3651 CGGCATCAAG ACCCAGGGTG CCCGTCAGAA GTTCTCCAGC CTCTACATCT 

3701 CTCAGTTTAT CATCATGTAT AGTCTTGATG GGAAGAAGTG GCAGACTTAT 

3751 CGAGGAAATT CCACTGGAAC CTTAATGGTC TTCTTTGGCA ATGTGGATTC 

3801 ATCTGGGATA AAACACAATA TTTTTAACCC TCCAATTATT GCTCGATACA 

3851 TCCGTTTGCA CCCAACTCAT TATAGCATTC GCAGCACTCT TCGCATGGAG 

3901 TTGATGGGCT GTGATTTAAA TAGTTGCAGC ATGCCATTGG GAATGGAGAG 

3951 TAAAGCAATA TCAGATGCAC AGATTACTGC TTCATCCTAC TTTACCAATA 

4001 TGTTTGCCAC CTGGTCTCCT TCAAAAGCTC GACTTCACCT CCAAGGGAGG 

4051 AGTAATGCCT GGAGACCTCA GGTGAATAAT CCAAAAGAGT GGCTGCAAGT 

4101 GGACTTCCAG AAGACAATGA AAGTCACAGG AGTAACTACT CAGGGAGTAA 

4151 AATCTCTGCT TACCAGCATG TATGTGAAGG AGTTCCTCAT CTCCAGCAGT 

4201 CAAGATGGCC ATCAGTGGAC TCTCTTTTTT CAGAATGGCA AAGTAAAGGT 

4251 TTTTCAGGGA AATCAAGACT CCTTCACACC TGTGGTGAAC TCTCTAGACC 

4301 CACCGTTACT GACTCGCTAC CTTCGAATTC ACCCCCAGAG TTGGGTGCAC 

4351 CAGATTGCCC TGAGGATGGA GGTTCTGGGC TGCGAGGCAC AGGACCTCTA 

4401 C 